Postgres FM

Nikolay and Michael discuss the track_planning parameter of pg_stat_statements — what it is, how it affects performance, and when or whether you should switch it on.
 
Here are some links to things they mentioned:

~~~

What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!

~~~

Postgres FM is produced by:

With special thanks to:
  • Jessie Draws for the elephant artwork 

Creators & Guests

Host
Michael Christofides
Founder of pgMustard
Host
Nikolay Samokhvalov
Founder of Postgres AI

What is Postgres FM?

A weekly podcast about all things PostgreSQL

Michael: Hello and welcome to Postgres.FM, a weekly show about

all things PostgreSQL.

I am Michael, founder of pgMustard, and this is my co-host Nikolay,

founder of Postgres.AI.

Hey Nikolay, what are we talking about today?

Nikolay: Hi Michael, let's talk about performance cliffs and

1 case particularly, track_planning.

The reason why I brought this to our attention is that I observed

recently several, actually over the last few years, I observed

several strong cases in production systems, various production

systems, where lack of planning phase tracking caused huge amount,

huge effort invested to troubleshooting.

And you know, our best example is, we talked about this many

times, during planning Postgres locks all indexes with access

share lock.

All tables, all indexes involved, even those which will not be

used, it still locks them, right?

And we talked a lot about lightweight lock, LockManager contention

issues, right?

And it happens during planning.

And if we don't have track_planning, pg_stat_statements.track_planning

enabled, it's not visible to our top-down query analysis.

We cannot see which queries spend a lot of time in planning and

a lot of buffers, maybe, right, and so on.

Michael: Yes, and crucially, not only is pg_stat_statements not

on by default although a lot of people have it turned on but

pg_stat_statements track timing is off by default even when you

enable pg_stat_statements so most folks that enable pg_stat_statements

don't enable track_planning including most cloud providers.

Nikolay: So let's zoom out, actually.

This is a good point.

By default, Postgres presents only poor metrics for query analysis.

If you have I/O timing enabled, in the pg_stat_database you see

I/O timing.

You have a number of transactions, you can understand how many

transactions per second not queries per second.

It's like Postgres doesn't track it and that's it so you don't

understand like?

You don't understand throughput.

You don't understand the average latency in general.

It's super hard.

And even at the highest level, like my whole database, how is

it doing?

How many TPS?

How many TPS is fine?

How many QPS?

Queries per second.

And what's about latencies?

But if you want to go down and understand which groups of queries,

parts of workload are responsible for, like, where do we spend

time, For example, or do some work
with the buffer pool.

To do that, you don't have it by
default.

You need to install the pg_stat_statements
extension.

And this is actually what most
people I know do.

This lesson is learned and pg_stat_statements
is maybe the most popular extension

because of that, right?

Definitely.

And we did have a couple of episodes
about pg_stat_statements.

It's hard to overstate that this
extension is not good.

It's really valuable.

And there are opinions that it
should be already...

It's time for it to go to the core
engine, because it sucks that

we install this extension and Postgres
needs to parse, analyze

query once again, just for metrics.

So it should be part of core.

But it's already a non-default
situation when you have a good

query analysis.

And it has problems, as we discussed
many times, it doesn't track

failing queries and so on.

It's another story.

But next step, we install pg_stat_statements,
we see all queries,

well, top 5,000 by default, right?

It has pg_stat_statements.max parameter,
which controls how many

normalized, aggregated, or official
languages, normalized queries,

queries without parameters, are
tracked.

But then, at some point, and actually
I think many folks don't

understand it.

Honestly, I didn't feel it so deeply
as I do it now, like last

couple of years maybe.

We track only part of our, of Postgres
work.

We track only execution.

We don't track planning.

And planning can be a huge headache
if workload is heavy and

machine is huge and so on.

So in heavily loaded projects,
not tracking planning means like

you don't...

You can, like, very roughly, You
can be in trouble in 50% of

production cases when you need
to investigate what's happening,

what's causing high CPU consumption,
for example.

But pg_stat_statements doesn't know,
it doesn't see it.

It doesn't see the planning phase
at all.

Only execution phase.

So it feels like very weird.

It's like we have, okay, we install
pg_stat_statements, we tell

all people who have LTP install
it, we know overhead is quite

low, and we will discuss it in
depth in a moment.

But then we say, okay, actually,
the pg_stat_statements extension

we actually installed, it's like
half of it.

Half of the extension, because
it's only about execution.

But planning is a super important
phase and we don't have it.

So all folks who have pg_stat_statements,
most of them, most

of them.

All folks who haven't changed settings
of pg_stat_statements,

they have only part of solution.

Michael: Yeah, I would probably
make the case that it's a bit

more than half.

Like I think planning in general
is like a much lower proportion

of performance issues are planning
related than execution related,

in my experience.

But I do take your point that it's
not it's not 0 percent planning

related so even if it's 80 20 or
90 10 it's still there's still

a whole category of issues that
we're not spotting by having

this off and I think there's a
couple of extra things like that

I don't think it's just heavily
loaded systems I've seen planning

dominated queries that are analytical
in nature where it's just

somebody trying to diagnose why
a single query is slow and that

has hidden as well if you're just
looking at things like...

Nikolay: This is just-in-time comment
actually.

Michael: Well, there's just-in-time
compilation as well.

But, yeah, good.

No, but sticking to planning, I
think there's a few confusing

things.

And I see this mistake made by
all sorts of quite expert people

as well when they're they might
be looking at a query a single

query plan and they say this query
plan only takes 100 milliseconds

but there's 10 milliseconds of
planning as well like they're

only looking at the execution time
they don't realize that the

planning is in addition to the
execution time, which they should

be saying is 110 or how they should
be summing those 2.

And the same is true in pg_stat_statements,
we should be summing

the 2.

And we can't sum the 2 if it's
off or we can, It's just it mistakenly

tells us that there's 0 planning
time where it's actually just

not being tracked

Nikolay: and Just a set once had
only metrics Call like total

time to mean time and so on before
Postgres 13 and since 13 it

was renamed it was renamed right
it never tracked planning right

Michael: there was no planning
it was only execution but the

naming made it clearer.

And on 13...

Nikolay: And 13 it was renamed
to total_exec_time,

min_exec_time and so on and there
is another group of metrics

total_plan_time and so on, min_plan_time
and so on.

And it means that, okay, we have
now the setting, but it's not

on by default.

And It means we have a second lesson
to learn.

Should we enable it?

We say install pg_stat_statements.

Don't go without it.

It's the number 1 recommendation
in terms of query analysis.

Always have pg_stat_statements settings.

I remember when it appeared, some rumors said, it adds 7% of

overhead, but it's worth it.

Well, as we can see now, it's actually very small overhead under

normal circumstances, but we had an episode showing, remember,

4 million TPS, showing that on edge case, overhead can be drastical

and drop performance.

It's like performance cliff.

But I doubt normal projects reach that point.

Everyone should be aware of that overhead, but again, it's a

cliff.

It doesn't come immediately.

It comes only when you reach a lot, a lot of queries per second

for a particular query ID, right?

Michael: Yes, and I think this suffers.

Tell me if you think I'm wrong I think the problem here is that

the easiest way to stress test Postgres or the easiest way to

do a quick benchmark is using pgbench yes which has this exact

Nikolay: it's not a lot of secrets immediately let's okay let's

talk about like we are like problem like it just says Edmonds

is number number 1 recommendation for query analysis.

Install it.

Everyone agrees?

Good.

There is some overhead.

We don't see it.

Like, it's not huge.

Okay.

Well, we saw it can be huge, but with some pathological workloads,

right?

And we will discuss it.

Now, question of this episode, basically.

Should everyone start saying, enable track_planning?

Should we, like, first default situation when Postgres doesn't

have anything and then we install extension, okay, everyone learned

this lesson.

It's kind of solved.

Should we consider the second lesson similarly and tell everyone

enable track_planning?

Because this gives you full power of producer statements now

Michael: Yeah my current opinion And I'll be happy to adjust

this as the episode goes on is I would enable this very early

in a project these days and Doing research for this.

I did come across some interesting things.

I didn't realize including why and when the default was made

off.

But yeah, I would say when you've not got high volume, high load,

turn it on while you've not got high load and then reassess if

you end up with a pathological

Nikolay: workload.

Which I

Michael: don't think you will.

Nikolay: I understand this approach
And usually it's good.

For example, if we had some overhead
which would grow monotonically

with...

Maybe linearly or somehow with
our workload.

I would understand this approach.

Because take these weights and
go to the gym with additional

weights all the time.

So you got used to it.

But in this case, it's very different.

We don't have any noticeable overhead,
as we saw from benchmarks.

We don't have it and it happens
only at very, very, very, very

extreme situations.

Let's talk about history.

I just learned about what you found
right before our recording

so I was super surprised.

Tell us more about it, very interesting.

Michael: Yeah so 1 thing, like
an easy thing I do in preparation

for most episodes is just check
when was the feature introduced,

what was the commit, like what
was the discussion around it And

I noticed it was in version 13
that it was introduced and I went

to the version 13 release notes
to see what was said, kind of

as a high-level summary.

And the release notes now have
these awesome little links to

the commits.

Even for old versions?

Well, 13 had it.

I knew they did it for the latest
version and I was surprised

to see them in 13 but very pleasantly
surprised so thank you

to whoever did that.

Like
Nikolay: a portal, that's great.

Yeah.

So every item in RezNodes we can
quickly trace discussion, commits

and so on, right?

That's great.

Michael: Yes, exactly.

Or you can go to the commits and
from the commits you can go

to the discussions.

In most cases.

Yeah, in most cases.

But often new features especially
simpler ones like this like

small like a new parameter they'll
only have 1 commit this 1

had 2 which which piqued my interest
immediately so I opened

them both and the first 1 was pretty
pretty normal it made sense

that this was added.

And then the second 1, it was when
I realized, oh it was during

the beta, it was during the beta
phase for 13, somebody had reported

a performance issue with having
track_planning on and turning

it off made the performance go
back to exactly how it was in

12.

So basically they had a regression
according to a synthetic benchmark

and then asked could we turn it
off and there was it was pretty

unanimous I think it got a few
replies and all the replies were

in favor of turning it off for
the version 13 release and as

far as I can tell it's not been
revisited since.

Nikolay: Yeah, so I now understand
much better what's happening

here.

Let me like unwrap it here.

So what happened?

It was obviously, it would be obviously
good if it was enabled

for all by default, right?

But then it was learned that there
is performance degradation

45%.

Okay, it doesn't matter, actually,
it can be 90% in some cases,

it depends on your machine and
so on.

And it was related to spin lock
contention in the results.

And this is exactly what we have
recently observed in our benchmarks

using our AI workflow which we
discussed also several times.

I wanted to say how I see our benchmark
workflow for synthetic

benchmarks, pgbench and so on.

We built a lot, and we collect
80 artifacts for each run, very

comprehensive configuration, everything
is stored, we can iterate.

And I see LLMs are just like kind
of oil so eventually we should

have well oiled machine right engine
and but the Main part is

not in LLM.

It's just it just makes it easier
to iterate So we wanted to

check overhead and we checked it
So pgbench which was also used

in like in this Research mentioned
in the mailing list pgbench

by default is not limiting TPS
But in normal circumstances under

normal circumstances in production,
of course we don't, so this

is not just load test, it's a kind
of load test which is called

stress test.

So we are on the edge, We are checking
what's happening on the

edge.

And in production we don't have
it normally.

You don't want to work at 100%
CPU usually in OLTP because you

will experience various interesting
phenomena like this basically,

right?

So spin lock.

And we checked it and like we took
very big machines, I think

192 cores with 5 generation, fifth
generation Google Cloud, still

spending Google Cloud credits,
right?

Fifth generation of Xeon scalable
Intel and almost 200 cores,

a lot of memory.

It doesn't matter because we also
took small scale, scale 100,

that means 10 million entries in
pgbench accounts only.

And we started, first we started
not select only, no difference.

Like if you use read-write workload,
you don't see difference.

But once you switch to select only
workload, you quickly observe.

So what we did, and like in our
show notes, we will have links

to detailed details, all details
with all numbers and reproducible

with all pictures and so on.

So we start with just 1 client
and 10, 20 and so on, 100 and

so on until 200 clients and we
expect that we should basically

grow in terms of TPS, right?

Because more clients, more TPS,
and Postgres these days scales

to many cores quite well.

But we quickly saw that we've enabled
on this machine, quite

powerful machine, right?

We've enabled pg_stat_statements.track_planning

Peak is reached after 30, between
30 and 40 clients.

Very early.

And then TPS goes down.

And without track_planning, similar
picture actually, but it's

later, until like 70-80, like 2
times to the right, 2 times more

clients and TPS also higher.

And once the peak is reached and
we go down, so first the lines

are together, but once the peak
is reached and we go down, we

see like kind of 2 times difference.

Peak is reached 2 times sooner
And TPS is 2 times slower.

Interesting, right?

But looking at this, I was thinking,
okay, why do we have peak

without track_planning reached
much sooner than 196 cores, I

think, vCPUs?

Because normally it should be there.

How many cores we have?

This is our maximum, most optimal
point of load.

There are nuances because we run
pgbench on the same machine,

so we limit number of threads by
25% of cores, so we couldn't

have all CPU consumed by clients,
basically.

So anyway, Then I recalled our
February tests and we had a podcast

episode about it.

pg_stat_statements overhead, 4 million TPS.

We needed to remove pg_stat_statements to reach
maximum workload.

I remember when we removed it,
peak shifted to normal position,

closer to number of vCPUs we have.

So what does it tell me?

Okay, I'm thinking, oh, it looks
like pg_stat_statements has some

significant overhead, but when
we enable track_planning, this

overhead doubles.

And then, since our automation
collects flame graphs, it collects

pg_wait_sampling analysis and
so on, so we quickly identify

there are spin locks indeed, but
the same spin locks are present

when you just use previous assessments.

This is the most interesting part,
right?

So on flame graphs we see, and
like, I think I should publish

a report.

If I have it already, please let's
have it in show notes, link

to the report.

So in Flame Graphs we see that
without track_planning enabled,

for this particular workload, we
have spin lock contention, and

there is a very wide s_lock
function inside pgss store.

pgss store is a function which
saves metrics.

So what's happening?

Michael: To pg_stat_statements,
that's the pass, right?

Nikolay: Yeah, yeah.

And like I would expect, like,
I remember in February I expected,

like, I wanted to quickly reach
1 million TPS and go further.

And I know it was achieved by Alexander
Korotkov in 2016, like

8 years ago.

So I was very surprised I couldn't
reach it easily.

And only when I removed pg_stat_statements,
I reached it.

So what's happening here is with
pg_stat_statements used, not

track_planning, just pg_stat_statements,
if you have such a weird workload,

like it's just a single query ID
and it runs hundreds of thousands

of TPS for this, QPS in this case,
queries per second, For this

particular machine, it can be lower.

Actually, we reproduce problems
on 8 core machines.

This is super interesting point
as well.

Yeah, yeah.

So this performance cliff can come
to you on small machines as

well.

Michael: But wait, wait, how many
concurrent connections?

I guess it comes sooner?

Nikolay: Yeah, well, it comes sooner.

I don't remember from the top of
my head, but I know this problem

can happen sooner.

So the idea is, if we have limited
resources and transactions,

these queries are so fast, like
I think it was like 15 microseconds

or so, it's super fast for primary
key lookup.

But they all fight to update metrics
in a single pg_stat_statements

record.

And spin lock is required for that.

That's why we see spin lock contention.

So it's observer effect as it is.

Pure observer effect.

And pg_stat_statements can have observer
effect, but you need to

reach a lot of queries per second.

Michael: Good point.

Nikolay: Because of limited resources,
right?

So, yeah, and this means that in
reality, very unlikely you will

see it.

Maybe, maybe, but unlikely.

It should be super fast index-only
scan, for example, and the

frequency is so high.

But When you look at flame graphs
with track_planning enabled

you see exactly 2 areas similar
width Both are as log inside

pgss store and 1 pgss store is
in execution phase and other pgss

store in planning phase So metrics
are saved separately and if

you enable they are saved 2 times.

That fully explains what's happening
here.

If we enable it we just move performance
cliff 2 times closer

to us.

Well if we sit in 0, right?

If we shift it already, it's...

Yeah, so...

Michael: Well it explains why in
your synthetic benchmarks you

got saturation twice as fast.

Nikolay: Yes, yes.

All pieces of the puzzle are like...

This took quite a lot.

We had 4 sessions of various kinds
of research starting in February

with pg_stat_statements and idea,
let's squeeze a lot of TPS from

our big machines.

But this is interesting, right?

Let's think about it.

So we have this pgss store.

Obviously, I think it should be
possible to find a way to store

it just once.

I mean, to save it once.

Michael: Well, I remember you suggested
that when we were discussing

privately, and since then, having
read the discussions, it's

a deliberate design decision to
save them separately.

And I didn't know about that.

So you mentioned a while back,
we don't even track queries that

fail for example.

Well now in a way we are like because
the number of times a query

was planned doesn't have to equal
the number of times it was

executed anymore in pg_stat_statements.

Once you've got track_planning
on you can see the number of times

it was planned versus the number
of times it was executed.

Nikolay: Are you saying that if
we have a failed quiz but it

was planned, planned time would
be saved?

Michael: Yes.

That's my understanding from reading
the...

I haven't tested this, but that's
my understanding from reading

the discussions around the design.

Nikolay: Still I think it should
be possible to optimize.

Yeah, I understand the design decision,
but if we think about...

Michael: Either the design would
need to change or the...

Yeah exactly but it's a trade-off
and that's like I found that

interesting.

Nikolay: I remember cases, terrible
cases which were very hard,

extremely hard to diagnose because
pg_stat_statements didn't show

like for example merge joins which
like merge join itself like

not merge join somehow like maybe
hash join was used, but considering

merge join, planar spent many seconds,
like 5 to 10 seconds,

just considering it.

Because it needs sometimes to check
the table actual data and

see like min or max values for
some columns, which is unexpected,

but planner sometimes does it.

And if it takes too long, planning
phase, even if merge join

is not chosen eventually.

In the end, we don't see it, right?

Log manager contention, right?

Sometimes with JSONB values, we
have something like planning

is long and consuming a lot of
CPU, but we should see it.

We should see it, right?

And pg_stat_kcache, which tracks
physical metrics and those people

who are, I don't know, like on
self-managed Postgres can use

it and it's great.

It also has track_planning actually
the same, same parameter,

right?

Michael: Oh no, off by default.

Nikolay: It's all by default.

Yes.

So now we understand the design
decision and we understand that

some benchmark, which checked very,
very like edge case, I would

say maybe even corner case because
you have unlimited TPS and

you have select only single query
ID it's edge it's corner case

not just edge we have 2 edges coming
here it's a corner case

so likelihood of having it in production
extremely low and decision

in my opinion decision was wrong
what do you think

Michael: yeah I think I agree I
can see why it was made I think

if you see it's quite a startling
number if you if you look at

45% drop that was the number reported
and shown.

I can see why people were scared
of that, especially when it

was in the beta phase.

But in hindsight, especially with
what I know now based on what

you've said, it seems like we'd
be much better off having it

on for everybody.

Nikolay: 45% is roughly 50, right?

It's like 2x.

Michael: It sounds scary though,
right?

Nikolay: It's reasonable.

If you are sitting on the edge,
you are suffering from this penalty

pg_stat_statements already gives
you.

You just double it, because pgss store
is called twice.

That's why it's roughly 50%.

But the conclusion here is not,
let's not enable by default.

Let's put a note in documentation
that pg_stat_statements is

sensitive to the cases when it's
a single query ID and a lot

of high frequent queries happening.

Michael: I think there is that
note, yeah.

Nikolay: If there is, it's good.

Then track_planning just doubles
this penalty and that's it,

but it's not happening under normal
circumstances.

Michael: It's a really good point
that it's not track_planning.

track_planning isn't the thing
causing this.

The track_planning makes it twice
as bad.

But twice as bad is quite a small
multiplication factor when

you're talking about such an extreme
case.

Nikolay: Yeah.

It would be like

Michael: if we imagine if we had
twice as you know that the transaction

ID wraparound is a couple of billion?

Or a couple of billion in the positive.

If it was 4000000000 in the positive,
that would make a bit of

difference, but not a huge, huge
difference.

In these extreme cases, twice isn't
that bad.

Nikolay: So I have 2 separate questions
here.

Can we have a single pgss store
instead of 2?

And second question, should we
recommend enabling it to everyone

but understanding that doubles
the problem of pg_stat_statements which

already is present, it's there
already, but you just don't feel

it.

You don't feel it, for example,
on that machine, almost 200 vCPUs.

You don't feel it unless a single
query ID has like 200, 000

calls per second.

200, 000.

It's insane.

This is the edge.

It's super far.

And without track_planning, okay,
it's 400, 000.

So not enabling it, you're just
saying, okay, instead of 200,

000, we will have 400, 000.

But looking at your statistics,
you see, okay, maximum calls

per second is what?

1, 000?

5, 000?

10, 000?

It's already a lot, right?

And usually we have different kinds
of problems with such queries

which are exactly present during
planning time because locking,

right?

It's already a problem.

Michael: Yeah, such a good point.

Nikolay: And I wonder, if we have
a high frequent query, a single

query ID like 10, 000, primary
key lookup in many cases, maybe

in most cases.

We say, okay, log manager can come
after you, right?

Let's cache it.

I don't know, like prepare statements,
or maybe plp.js.calc function,

indirect caching and so on.

Let's avoid planning.

Planning will be okay, right?

Michael: I haven't checked, but
that would be an interesting

test.

Actually, no.

Yeah, you could do the same test
again, but with prepared statements.

It would be interesting to see
if track_planning is updated at

all.

Presumably it's not doing planning
so it shouldn't do that.

Nikolay: That's a great point I'm
going to check it because like

as I said with our almost well
oiled machine I will just ask

to repeat experiment and with hyphen
capital M prepared we will

see and basically it should have
effect like we disable it but

we keep it enabled right

Michael: with a pool and maybe
like with some way of keeping

the sessions Remember last time
it was doing the first...

Nikolay: Yeah, I remember.

Michael: Would it need to be that
or not?

I actually can't remember.

Maybe that would be fine in this
case.

Nikolay: In this case it will be
fine because the problem is

like it's vice versa.

pgbench maintains connection here.

What we discussed is that like
in reality, if connection is not

maintained, we connect again and
then we don't have cache.

So the connection, okay, I will
double check this area.

It's good point.

And additional research to make
the picture complete.

But my gut tells me it should be
enabled in many cases, but we

should be aware of spin-lock contention,
which can happen if

we have high frequency.

Honestly, I'm thinking to find
some rule maybe to in monitoring,

for example, we know this machine
has this number of vCPUs, and

we can quickly check how many calls
per second, even if

track_planning is not enabled, we can
check it how many calls per second

we have for most frequent,

Michael: Very high frequency.

Order by calls, descending.

Nikolay: Order by calls, exactly.

It's not hard.

Yeah.

So top 10 by calls, and how big
is that?

And okay, this number of cores,
this number of QPS, this level

of QPS, just roughly estimate how
far we are from the performance

cliff.

Maybe it's possible to do this,
right?

For modern...

Of course, it depends on many things.

It doesn't depend on the plan,
I think.

Because saving happens only once.

It depends on duration, but not
on duration, on frequency.

And of course, on the resources,
type of processor you're using.

So I think it's possible to have
some kind of prediction, how

far we are from this.

And if we are really far, let's
enable track_planning.

This is my point.

Meanwhile, let's think maybe it's
possible to save it just once

and reconsider decision-making.

I'm really thankful you made this
research.

I liked it in my puzzle.

Like I still collecting pieces
of it.

Cool.

So enable it, but it will double
overhead and be aware of overhead.

This is the bottom line from pg_stat_statements.

Michael: Yeah.

If you ran a managed Postgres service,
would you enable it for

everybody by default at the beginning?

Nikolay: Well, yeah, this is another
piece of great info you

found.

We almost forgot to discuss it.

You found the...

So you found that many...

Well, it's not surprise to me.

You sent me the list of managed
service providers and what's

possible what's not there it's
not surprise for me that most

have it off by default surprise
for me it was that Google Cloud

even doesn't allow to change it

Michael: not just Google Cloud
quite a few of the providers...

Nikolay: Crunchy Bridge right?

Michael: So yeah yeah I is it's
definitely not a it's definitely

not a full list, and I'm guilty
of testing more of the ones that

make it easy to spin up new instances.

Nikolay: But let's drop some names
here.

RDS or by default?

Michael: Should we go the other
way around?

Is the other direction is the ones
that do make it configurable.

RDS is I think still by far the
most popular.

Off by default but configurable.

Bear in mind pg_stat_statements in
RDS is on by default.

So they do change some of

Nikolay: the defaults

Michael: around this, for example.

Most of these provide, in fact,
I can't remember the last time

I checked a provider that didn't
have pg_stat_statements on by

default.

You mean

Nikolay: on by default, you mean
it's in shared preload libraries,

or it's created in like template
database?

Michael: It's, yeah, it's, it starts
tracking queries without

you doing anything.

You don't have to create extension
or whatever the normal thing

is to do.

Nikolay: I don't remember should
we create extension.

Michael: Almost all providers,
because they rely on it for their

monitor, like the monitoring they
present you runs off

pg_stat_statements for most of these.

So it's on by default, but
track_planning isn't.

That's true for almost all of the
ones I tested on.

But I think Timescale...

Nikolay: Timescale is champion
here.

Michael: Yeah.

Well, it's configurable but it's
also on by default which was,

I think, the only one that I found
that had it on by default.

There probably are others I didn't
check them all.

It's quite time consuming.

Nikolay: That's great.

I love this decision.

Michael: Yeah, it's great.

Nikolay: But a

Michael: few of the others made
it configurable, which is nice.

I just was surprised that at least
one of the major ones doesn't

even.

Nikolay: Yeah, Google CloudSQL
even doesn't allow to change

it and we...

You gave up trying to check Azure.

Michael: Yeah, I had limited time
and the UI and I don't...

I need to talk to somebody there
to teach me how to use it or

something because I was really
struggling.

Nikolay: Yeah, I understand that.

So, yeah, so who can like, our
recommendation is to for those

who are listening from those teams
like CloudSQL, Crunchy and

Supabase guys like make it configurable.

It's not normal that it's not.

I would say consider making it
on by default, but at least bare

minimum should be let users to
decide.

Michael: I think so, and I think
normally when these providers

change a default they'll change
it for new instances only where

I think there's just so such low
risk so I like that approach

of you know if people create new
clusters or new instances, have

it on by default.

It will help people when they come
across planning time related

issues, diagnose those issues much
quicker, much easier, whether

that's on their own or once they
pull people in.

Nikolay: Yeah yeah yeah that's
that's that's for sure yeah so

good I hope we brought some food
for thoughts to guys who are

listening to us.

So yeah, thank you.

Michael: Nice one Nikolay.

Nikolay: Yeah, thank you.

Michael: Take care, have a good
week, bye.