A weekly podcast about all things PostgreSQL
Michael: Hello and welcome to PostgresFM, a weekly show about
all things PostgreSQL.
I am Michael, founder of pgMustard.
This is Nikolay, founder of Postgres.AI.
Hello Nikolay.
Nikolay: Hi Michael, how are you?
Michael: I'm good, how are you?
Nikolay: Very good.
Let's talk about some crazy stuff.
Michael: I chose the topic this week, and it was a listener suggestion: Postgres gotchas. We had a short discussion before this call about what we should even count as gotchas. So how do you define it? What do you think of when you hear "Postgres gotcha"?
Nikolay: Well, if you, for example, run, I don't know, SELECT 5/2, what will you get?
Michael: Is this like data type casting?
Nikolay: Right, right. Well, it's not Postgres related, I think it's SQL related, but still, what do you expect to get? A normal person would expect 2.5, right? But you will get... and I just double-checked, because I'm always thinking: am I right? So you're getting 2. 5/2 gives 2.
Michael: Oh wow, I actually thought it was going to be 3.
Nikolay: We probably have 2 things to discuss here. First is, of course, integer divided by integer, right? The result should be an integer, so there's no decimal here, no point, no dot. But 2 or 3? Rounding or flooring, right?
Michael: Yeah, because rounding conventions, at least where I grew up, generally round up from 0.5, but clearly not here. Anyway, I like this as almost a definition: something where you may expect 1 outcome but a different outcome happens. That's...
Nikolay: ...unexpected behavior, basically, in other words.
Michael: Yeah, especially unexpected if you haven't read the
documentation.
Nikolay: Sometimes the documentation is also wrong or misleading, or sometimes it's just confusing.
Michael: Yeah fair enough.
I just wanted an excuse to list a few things that are made very clear in the documentation, but I know people get confused by them because often they don't read it.
Nikolay: You know, I'm curious, I actually found a good example, right? It's probably a SQL gotcha, not a Postgres one, because I think it's defined by the standard. And this is super basic, but I'm wondering why indeed it's 2 and not 3. So what I did: I double-checked, of course. I checked the floor and round functions. I converted 5 to numeric, or just...
Michael: ...cast with the double colon.
Nikolay: Right, right. So the result is also numeric, and then we have, of course, 2.5. But then I checked: floor gives me 2, round gives me 3, as expected, because with 2.5, if you're rounding... there are names for these rules, I don't remember them, right? But usually with 5 you go up, with 4 you go down, right? And round gives 3.
But I was thinking, okay, why does 5 slash 2, integer and integer, give 2 and not 3? Maybe because it uses floor. We could check the source code, of course, but the next thing I checked is the final thing I checked: 5 converted to numeric, divided by 2, gives you 2.5. And then you convert back to int, right? Again, double colon int. It gives 3. Amazing. So how come we get 2 if we don't use explicit type conversion at all? Any ideas? It was a good gotcha to find. I always knew that it's 2, because I've demonstrated it. I just didn't think about why. Now I wonder why. I have questions.
Maybe there is some good, simple explanation, but I can easily see that it can be considered a gotcha.
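For the record, a quick psql sketch of what we tried here; the short explanation, as far as I know, is that integer division in Postgres truncates toward zero, while casting a numeric to int rounds half away from zero:

```sql
SELECT 5/2;                   -- 2: integer/integer discards the fraction
SELECT -5/2;                  -- -2: truncation toward zero, not flooring (floor(-2.5) is -3)
SELECT 5::numeric / 2;        -- 2.5: numeric division keeps the fraction
SELECT floor(5::numeric / 2); -- 2
SELECT round(5::numeric / 2); -- 3
SELECT (5::numeric / 2)::int; -- 3: casting numeric to int rounds
```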
And let's agree on 1 more thing.
We don't talk about nulls today.
Michael: I actually did have them
on my list, but not to go into
detail.
Nikolay: We had a whole episode, and that whole episode is basically saying that nulls are surrounded by gotchas: both standard-defined, global SQL gotchas and local Postgres-related gotchas as well, for example when you go to arrays, right? Let's just exclude them, because it's a whole massive area.
Michael: I think that's very sensible.
And if you haven't listened to
that episode, I'll link it up
in the show notes.
Nikolay: Consider it to be fully
included here.
Yeah.
Right.
Okay, let's maybe switch to your
ideas.
I know you have a list, pull something
from it.
Michael: Yeah, I've got a few that I wanted to make sure we mentioned. I'm going to start with some I consider more beginner-level gotchas, and then maybe we can work our way up a little bit. The first 1 I had is 1 I see all the time: in forums, every now and again on Reddit, quite a few times in the Postgres Slack, a few times on mailing lists.
People get confused or annoyed when they name objects in Postgres using mixed case, maybe camel case: naming a table with, for example, a two-worded name, using capital letters for the 2 words and no space in between them, that kind of thing. And then they realize the table is not called that behind the scenes; it's all lowercase. And if they've used an ORM or something to name these, the ORM often puts quotes around those cased names, so then they become cased in the database. So I think this whole area is 1 big gotcha for beginners. And the people who set up the schema in the first place, if they do something like this, if they don't just use lowercase object names, can really make life a bit painful for everybody that comes after them, having to double-quote everything.
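To illustrate the folding behavior (the table names here are made up):

```sql
-- Unquoted identifiers are folded to lowercase:
CREATE TABLE UserAccounts (id int);  -- actually creates useraccounts
SELECT * FROM USERACCOUNTS;          -- works: folded to useraccounts too

-- Quoted identifiers preserve case and must be quoted forever after:
CREATE TABLE "UserOrders" (id int);
SELECT * FROM UserOrders;            -- ERROR: relation "userorders" does not exist
SELECT * FROM "UserOrders";          -- works
```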
Nikolay: Yeah, especially Java
people like it.
Michael: Oh, really?
Nikolay: Yeah, I noticed. Usually, if we have a new customer and we start discussing their database, we see some camel-cased table names, and I'm already thinking: are you Java guys? And they are. Usually, not always, but usually so.
And my previous startup was using camel style, and we just decided to keep it as is, because it was inherited from something... I think it was in MySQL originally. We started very lean, so we borrowed some schema from some open source project, and it had it. And knowing this very well, I still decided to keep it.
But the problem with double quotes is... it's not a big deal, but... you know what usually comes with quotes, right? Backslashes.
Michael: Oh, having...
Yeah, having to escape them.
Nikolay: Yeah, and you cannot escape
from this.
You need to use it to escape.
Right?
I mean, yeah.
And backslashes, if you have automation in many languages... sometimes it's a nightmare. Right now we have an ongoing problem with backslashes and escaping of quotes and double quotes in our AI system. It's a combination of Python and Bash, actually. And yeah, it drives us crazy sometimes. We've solved some problems, but still not all of them.
And when AI needs to talk about your table names and you give it the schema with double quotes, and then you want to experiment in DBLab to find some plans or something, it sometimes breaks just because of problems with backslashes and passing these pieces between different components.
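As a small illustration of the escaping pain (the table name is hypothetical):

```bash
# The double quotes around the identifier must survive the shell's own quoting:
psql -c "SELECT count(*) FROM \"UserOrders\";"
# Every extra layer (e.g. Python building a Bash command that calls psql)
# adds another round of escaping to get wrong.
```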
I would prefer keeping everything lowercase, that's it. In Postgres, right? Snake case, right?
Michael: Yeah.
Nikolay: And that's it.
And just forget about it.
Michael: Yeah, it pays off.
Nikolay: I wanted to mention the list of gotchas I saw. Actually, as we discussed right before this recording, I didn't know the word "gotcha" until 2006 or 2007, and this is exactly when this list was published. And I remember we had 2 lists, 1 for Postgres and 1 for MySQL. MySQL is outside of scope, right? Again, even more so than nulls. But this list, I'm looking at it right now, and first of all, it has survived how many years? It's insane: 17, 18 years. Wow.
And I'm looking at it, it's quite a short list, at sql-info.de, right? And almost everything is crossed out. Good.
Michael: Yeah, I've never seen this page before. This is really cool. And yeah, maybe more than half of them have been crossed out.
Nikolay: Yeah, "COUNT(*) is very slow". Let's just laugh at it.
Michael: Oh, and we had an episode
on that, right?
Nikolay: I think we should.
"Unicode means UTF-8" is left as well. Well, yeah. And constraint checking is interesting, but maybe let's not dive into it.
And finally, I wanted to highlight what else is left. Only 4 items. Okay, 5. But 1 of them is upgrades, upgrade-related stuff. And I was discussing this topic on Twitter, on X, over the last few days.
I just checked the documentation of 3 major managed Postgres providers: RDS, Cloud SQL, and Azure Postgres. I saw that all of them say: okay, the upgrade is done, the major upgrade. Now, don't forget, it's time to run ANALYZE. And they give you a snippet, ANALYZE semicolon, which means it's single-threaded as well. And first of all, none of them include it in the automation.
But we're not talking about managed Postgres providers today, we're talking about Postgres. My question is: why doesn't pg_upgrade include it? It only prints some text, but who reads the text? Many people will say: for those who don't read, it's on their shoulders, it's their problem. But I cannot agree with that.
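For context, the snippet those provider docs give is the single-threaded form; a parallel equivalent exists via vacuumdb (the database name here is made up):

```bash
# What the provider docs suggest (single-threaded, one database at a time):
psql -d mydb -c 'ANALYZE;'

# A parallel alternative: analyze every database with 8 workers, no vacuum:
vacuumdb --all --analyze-only --jobs=8
```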
Michael: I could even make the case that the gotcha here is that statistics are not preserved on a major upgrade. If you asked somebody who didn't know whether they would expect the statistics to be wiped or preserved, somebody who maybe doesn't understand how difficult it might be, or the catalog-level reasons why they're not preserved, you'd be forgiven for thinking that in the most advanced open source database they might be preserved, even though it's complicated. And I think there has been some work. It should...
Nikolay: ...happen. And everyone agrees on it; it's just not solved yet.
Michael: Yeah, so I think that's the main gotcha. And then, obviously, the solution at the moment is that we need to run ANALYZE.
Nikolay: Yeah, let's unwrap this a little bit. We're talking about statistics, which the planner requires to make the right decisions. If the database is not trivial, not just 100 rows in each table, we need statistics. Without statistics, the database is slow. And I think this happens in a zillion cases.
It's hard to estimate, but when I posted this tweet, I immediately received several replies: oh, we got beaten by this badly. Some person wrote: we learned this the hard way, obviously. This is what happens all the time. And recently, during summer, we also failed to help a customer with it. From now on, I keep pointing out: don't forget, don't forget, don't forget. But back then we discussed upgrades, we discussed some complicated stuff, and then they went to upgrade, and the next Monday... they are in Europe, so what was morning for me was already evening for them... they said: we had a nasty database incident today, because guess what? We forgot to run ANALYZE.
And this is a huge gotcha, not only for managed services. They could handle it, but I don't care too much about them. This is a gotcha of pg_upgrade.
I think it's very well understood; there is work in progress for export and import of statistics, quite a long thread on the hackers mailing list already, and a commitfest entry. It's still not done, but there is hope.
There is 1 more thing here. So I think pg_upgrade definitely should have it, and also pg_restore, but let's return to that in a few minutes. When someone says you need to recalculate statistics yourself after pg_upgrade... I definitely would expect pg_upgrade to do this, for sure. But okay, it just prints: do it yourself.
pg_upgrade prints vacuumdb, I think, no? Or in stages, like vacuumdb --analyze-in-stages. It has vacuumdb --analyze-only and --analyze-in-stages, 2 options. --analyze-only means no vacuum, just ANALYZE.
--analyze-in-stages means first it collects only 1 bucket for each table, for each column actually, right? Then some midway value, and then default_statistics_target, or the individual per-column setting for each column. default_statistics_target is 100 by default: 100 buckets. This is what defines how much the Postgres planner will know about the distribution in each column, right? And it can be adjusted globally or for individual columns.
But...
Michael: Almost like a sample size; the sample size of what it samples increases.
Nikolay: Yeah.
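A sketch of the staged behavior and the per-column knob just described (the table and column names are made up; the stages use statistics targets 1, 10, and then the configured default):

```bash
# Three ANALYZE passes per database: statistics target 1, then 10, then the
# default, so the planner gets rough statistics as early as possible:
vacuumdb --all --analyze-in-stages
```

```sql
-- Per-column override of the default (100) statistics target:
ALTER TABLE orders ALTER COLUMN customer_id SET STATISTICS 1000;
ANALYZE orders;
```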
And I quickly realized this in-stages approach is not what you want to do in OLTP, because in OLTP you include ANALYZE inside the maintenance window, and in that case there is no sense in running it 3 times. Actually, that's the only option: I don't trust opening the gates with weak statistics. It would be, again, an unpredictable incident.
Michael: Yeah.
Yeah, risky.
But this feels like a topic that
might be worth a deeper dive.
Nikolay: Right.
Right.
Right.
I just wanted to point out that when people say ANALYZE, it's just single-threaded, right? You expect Postgres to have a lot of parallelism implemented, but when you run ANALYZE, it's always single-threaded. This is unexpected as well, right?
Michael: Good 1.
Yeah, I like that.
Nikolay: It's kind of a gotcha inside a gotcha. We have a matryoshka-style gotcha here, right? Russian doll style.
Michael: What do you recommend doing? Like, just kicking off a bunch of ANALYZE queries, like, per table?
Nikolay: I recommend... I recommend vacuumdb.
Michael: nice okay
Nikolay: And speed it up, going full speed inside a maintenance window. Again, it's a different story. But vacuumdb has --jobs, or -j, which you can set. And usually we take as many jobs as we have cores on the server and go full speed inside the maintenance window. ANALYZE should be quite fast if you don't do vacuum.
There is also a little gotcha in vacuumdb options, because vacuumdb's --analyze will also do vacuum. So there's...
Michael: ...also --analyze-only, or...
Nikolay: There is --analyze-only, yeah, but it's easy to overlook.
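The contrast, for reference:

```bash
vacuumdb --all --analyze        # runs VACUUM ANALYZE: vacuums too
vacuumdb --all --analyze-only   # runs ANALYZE only, no vacuum
```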
There's 1 more gotcha inside all this, when people go with manual single-threaded ANALYZE, or maybe multi-threaded, but explicit ANALYZE. It was interesting: someone said ANALYZE can be super expensive. I agree, but even if default_statistics_target is 1000, it's not that expensive.
But back to this: can we throttle it? For example, you want it to go very slowly. Because autovacuum has... or actually, doesn't it have...
Michael: ...vacuum_cost_delay also?
Nikolay: Yes, there are vacuum_cost_delay and vacuum_cost_limit. By default it's off, because... not the cost limit, the cost delay is 0, meaning it's not applied at all. But autovacuum has a mirrored pair of these settings, and there it's throttled.
Long story short, autovacuum is usually throttled. And we actually often fight with our customers a little bit to increase this, to make it even more aggressive. And managed providers have already done part of that job, which is good. But if we run vacuum manually, it's not throttled; it will go full speed. I doubt you will need to throttle it, but some people say: okay, let's throttle it. Question: how?
We have vacuum_cost_limit and vacuum_cost_delay. We can raise vacuum_cost_delay from the default 0 to some value. But is it about ANALYZE, or only about vacuum? Based on just the naming, it's not about ANALYZE; it's about vacuum only.
Michael: Right? Yeah, that would be my guess. But from the line of your questioning, I'm guessing it does affect ANALYZE.
Nikolay: Right. I go to the documentation and I see vacuum_cost_limit and vacuum_cost_delay, and I see the description. They talk only about vacuum; they don't talk about ANALYZE. So I conclude it's only about vacuum. But some part of my very long-term memory tells me: is it really so? And then Sadeq Dousti, who helped me a lot, and still helps sometimes when I write my how-tos, reviewing them, I'm very thankful for that...
I have 94 how-tos already. So I wrote a how-to on running ANALYZE, and right now I need to add this: vacuum_cost_limit and vacuum_cost_delay affect ANALYZE too. If you scroll up to the beginning of that documentation section, you will see that it talks about both vacuum and ANALYZE.
And also, using AI, I checked the source code and quickly found that there is a function called ExecVacuum, which handles both VACUUM and ANALYZE. This confusion comes from the source code, propagates to the documentation, and then to our heads, right? This is a gotcha as well. So you can use vacuum_cost_limit and vacuum_cost_delay to throttle ANALYZE, although I usually don't do it, because I think we need to go full speed.
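A minimal sketch of that throttling, for a session where slowing ANALYZE down is actually desired (the table name is made up):

```sql
-- vacuum_cost_delay = 0 disables cost-based throttling (the default for
-- manual commands); a non-zero delay throttles manual VACUUM and ANALYZE.
SET vacuum_cost_delay = '2ms';
SET vacuum_cost_limit = 200;
ANALYZE orders;  -- now paced by the cost model
```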
Michael: And in the 1 example we're talking about here, we want
to go faster, not slower.
Nikolay: Right, right.
But it's interesting, right?
So there is some small confusion.
I think there are many such confusions.
But they are slowly fixed.
They are slowly being fixed.
Michael: Great example of 1 where the documentation
wasn't helpful in avoiding that 1.
So yeah, nice.
I've got a few more beginner-friendly ones that maybe we could rattle through. I'm not even sure you'd consider them all gotchas, but 1 relates to... I think that not only catches... Well, you don't think it's a gotcha?
Nikolay: I think we should exclude this one the same way, using the same approach as with nulls, because it's a huge bag of... yeah.
Michael: Cool.
And then another 1 that I see catch out beginners, well, a couple, and again, maybe this is another huge topic: access exclusive locks, like just basically DDL blocking selects, blocking inserts, blocking everything. I think people don't realize, because if they're working on small databases at first, they start to grow in confidence about how to do things. And then it only catches them out once the startup's got really successful and the table's much bigger, that kind of thing.
Nikolay: Yeah, I think there are some memes with the Gauss distribution, or how is it called, right? Gaussian distribution. On the left, the beginner thinks schema changes are painful, super painful, they require expertise, right? Then in the middle, you can imagine: oh, Postgres has transactional DDL, everything is fine, we just need to wrap everything into 1 transaction and rely on Postgres' great capabilities of transactional DDL.
Michael: Yeah, or just use CONCURRENTLY and think...
Nikolay: Yeah, yeah, MVCC works, it doesn't block selects, selects always keep working. You can have a lot of exciting stuff in the middle. And then on the right you say: Postgres does have transactional DDL, but you cannot use it.
Michael: Well, the one on the right has to be the same as the left, doesn't it? Schema changes are painful.
Nikolay: Yes, they are painful. Transactional DDL cannot be used in many, many cases, and you can shoot yourself in the foot very easily, many times. So this is it, I think. And there are a lot of gotchas there.
Michael: Another 1 that's not necessarily... I'm not sure if this counts as a gotcha, I'd be interested in your opinion, but I have seen it catch people out, and I guess it's a little bit of a footgun, and you see the memes about this as well: DELETE or UPDATE without a WHERE clause, and then suddenly seeing the whole table updated instead of, you know, 5 rows. But you think it's not Postgres. Why do you say it's not Postgres, out of interest?
Nikolay: Because the SQL standard defines this, and any relational database which follows the SQL standard to some extent has it.
Michael: Yeah.
But I've seen clients catch these. Though the one that ships with Postgres, psql, doesn't. So I don't think it's unreasonable that Postgres does it this way, but I've definitely seen it catch people out.
Nikolay: I agree. Blame the SQL standard. And there is an extension to forbid it. Also, it was maybe 1 of the very first hacking sessions with Andrei: we implemented a GUC setting in Postgres which would allow administrators to... I don't remember if it was an error or a warning, whether it completely forbids it or just warns. I think it should be an error. But it was not accepted, of course, because I think it had been discussed many times before.
I mean, it's possible. And right now, as I see it, I think PostgREST does it, right? PostgREST...
Michael: Oh, does it?
Nikolay: So usually right now it's solved in middleware, if there is some middleware. So if you develop an API using PostgREST, for example... maybe I'm mistaken, but I think there is a way. If there is no way, it's a good idea to implement it, actually.
Michael: Yeah, it would be, yeah.
Nikolay: Yeah, so if in the URL of your HTTP request to the RESTful API there is a command to run DELETE without any filters, I think it will not do it; it will not go to Postgres with it. But inside the database, I don't think it's going to change, right? And we tried, actually.
Michael: Yes... well, "we tried", what do you mean?
Nikolay: We had a hacking session, yeah. We had a patch and submitted it, but there was only a small discussion, and chances are close to 0 that it will be successful.
Michael: Patch in psql or somewhere
else?
Nikolay: No, not in psql. We wanted to have a Postgres setting, a GUC setting, to allow administrators to forbid unspecified... I don't remember what we called it, maybe unqualified DELETE and UPDATE, when you don't have the WHERE clause at all.
Just like: don't do it.
Michael: Cool.
Nikolay: There's TRUNCATE if you want to wipe a whole table, right? An UPDATE of the whole table, I don't know. But again, right now it's usually solved in middleware. You can have triggers or something to protect yourself if you want. Or this extension, if you're on self-managed Postgres; there is some extension for it.
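The extension in question may be pg-safeupdate; a small sketch of its behavior, assuming that's the one and that it's installed (table name made up, error text approximate):

```sql
LOAD 'safeupdate';                -- or add it to session_preload_libraries
DELETE FROM orders;               -- ERROR: DELETE requires a WHERE clause
DELETE FROM orders WHERE id = 5;  -- fine
```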
I wanted to mention 1 thing, stepping back to DDL. My favorite example: do you think that just adding a column cannot take your traffic down, your database down? It can, if it cannot acquire the lock. It starts blocking all queries to this table, even SELECTs. Like: I cannot acquire the lock, so everyone else has to wait. Even SELECTs, which come with an ACCESS SHARE lock: don't touch this table, that's it. And that's not good. This is super unexpected and a super big surprise.
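A common mitigation, sketched here with made-up names, is a low lock_timeout plus retries, so a blocked ALTER gives up quickly instead of queueing everyone behind it:

```sql
SET lock_timeout = '2s';
-- Waits at most 2 seconds for the ACCESS EXCLUSIVE lock; if it can't get it,
-- the statement fails ("canceling statement due to lock timeout") and can be
-- retried later, instead of silently blocking every other query on the table:
ALTER TABLE orders ADD COLUMN note text;
```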
Michael: Yeah, especially because you could even test that. You could make a fork of production, or clone it, and run that on its own, in isolation, where there is no lock contention, and it runs super fast, so you think it's safe. But even though it was super fast, it does require that heavy lock, and that blocks everything else. Definitely a gotcha. That's a great...
Nikolay: Yeah, yeah. And there are many others in the area of DDL, but this one I find affects every single project that had at least some growth, like up to 1000 TPS; there it's already noticeable. It's not noticeable in tiny projects.
Michael: Yeah, yeah, which I think is kind of part of the problem, right? Or at least makes it worse, because anybody that gains experience on smaller projects gets some confidence, as they grow, that these things work, that they're fine, and then that changes at some point.
Nikolay: exactly
Michael: So, you've got a great article on this, actually, I think, that I can share. It's all about DDL changes, I think.
Nikolay: Yeah, yeah, this one is definitely included. I have a few different blog posts about DDL details. Okay, what else?
Michael: Well, in that area of making big changes partly by mistake, like the DELETE or UPDATE without WHERE: do you think people would expect to be able to undo something? I'm not thinking just of people with experience of relational databases, but of the fact that if you do something like that by mistake, you have to go to your backups to get that data back. If I was in my early 20s, or in my teens, all the other software I use has the ability to undo something if I do it by mistake. So it does feel slightly unexpected, if I've grown up in that world, that I can make 1 mistake and not be able to undo it.
Nikolay: And in what database can you? I mean, of course, there are like...
Michael: Yeah, I don't know.
Nikolay: Time travel, or... if branching is supported, you can... But it's still not super simple, right? Well, yeah, I don't know. I mean, there are several projects which try to solve this problem: specific databases supporting branching, for example, and others. Have you heard about this new company where Michael Stonebraker is participating?
Michael: I've
heard of it, I don't know much about it.
Nikolay: DBOps?
Or how is it called?
I don't remember.
Not DBOps, DBOS.
Michael: Yes.
Nikolay: So I wonder if this feature will be inside it. I don't know. For me, it's not expected. I cannot call this a gotcha, because it's a hard problem, and we haven't gotten used to having it, because it's rare. But if we go into young people's minds, there are many gotchas for them, right? Oh, this is not working as I expected. So naming is hard exactly because it should avoid gotchas: if you choose some name, you should think about other people and try to be very predictable. This is the art of naming. And we just discussed this vacuum_cost_limit affecting ANALYZE. And in young minds the world of gotchas is much bigger, because they don't yet understand the logic, the common practice, in many places, right? And that's why there are many more unexpected situations.
Michael: Yeah, fair.
Nikolay: Yeah, by the way, if you run pg_dump on a very large database, it goes 1 table after another, and we deal with a snapshot, right? With a snapshot: it's repeatable read plus a snapshot. And even if you use multiple workers for pg_dump, they are synchronized, reading from the same snapshot. But what if some table is dropped? Will it be present in the dump? I'm not sure, actually. I just think there is a gotcha...
Michael: ...here.
Nikolay: If during the dump there are some schema changes, can we have some inconsistency in backups? Not in backups, in dumps.
By the way, this confusion was resolved recently, right? We didn't mention it in the previous discussions of confusions, but the pg_dump page is not saying it's for backups anymore. So dumps are not backups.
Michael: That's coming.
Nikolay: Yeah, in 18.
Okay, okay, okay.
We will discuss this additionally.
But this is a big confusion: the documentation says pg_dump is a backup tool. It's still so, even in 16. And now all the people who said pg_dump is not a backup tool, we have every right to say we were right, because finally, in 18, it has been changed. I think this patch won't be reverted, I hope so. But back to our topic: pg_dump can be expected to be a tool for backups just because of the documentation.
Then you have quite a painful process of realizing it's very bad for backups. So it's a gotcha which is already fixed, in the future, in Postgres 18. And in this same area: pg_restore doesn't run ANALYZE, which hurts every second attempt to restore a big database, even for experienced people like me.
Because, again, it's like with pg_upgrade, we discussed it. With pg_restore, you restore a huge dump, you start working with it, but it's slow, and in plans, where you expect a simple index scan, you see bitmap index scans and bitmap heap scans, because there was no vacuum, no analyze. It also should run vacuum, to build the visibility maps. Or...
Michael: ...it might not be unreasonable to think that maybe backups would include the statistics, and then restore would restore the statistics.
Nikolay: Physical backups, I agree. Logical backups, I disagree. You don't include indexes, which are derivative of the data itself. Statistics are also derivative of the data. So you include a description, like CREATE INDEX. There should be, maybe, some word like ANALYZE in there; I wanted to say CREATE STATISTICS, but that's taken already, there is such a command. So why not restore? pg_restore could run it.
If we use pg_restore -j 8, for example, 8 parallel jobs, why in this case are we not recalculating statistics? Right now, everyone who deals with some automation needs this. For example, for DBLab Engine, we need it when provisioning is logical: it's dump/restore, and we need to move as fast as possible, with multiple jobs, multiple workers for dump, multiple workers for restore.
And then, of course, vacuumdb, in this case with --analyze, to have vacuum as well, and also with -j. But people forget about this, and forgetting means you expect it will be fine. But if you don't run ANALYZE yourself, ideally using multiple workers, performance will be very bad, and people start thinking: oh, Postgres is slow. Have you heard about this? Like: I restored from a dump and it's slow.
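Putting the whole flow together, a sketch of a parallel dump/restore (names and job counts are illustrative):

```bash
pg_dump -Fd -j 8 -f /backups/mydb.dir mydb      # directory format allows parallel dump
pg_restore -j 8 -d mydb_new /backups/mydb.dir   # parallel restore into an existing database
vacuumdb -d mydb_new --analyze --jobs=8         # the step people forget: vacuum + analyze
```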
Michael: I have to admit I haven't
heard that complaint, no.
Nikolay: Okay, I'm hearing it, and I've been doing this myself for almost 20 years. I know about this.
Michael: No, I'm not saying it doesn't exist, I just haven't, you know... Maybe it's just not the category of problem that gets noticed online. Maybe when people experience it, they don't post; they just go try something else.
Nikolay: Yeah, it's not a big deal if you restore, it's working, and you only start dealing with it a day later, because by then, likely, autovacuum will have done it. But it matters if you expect to run fast queries immediately after restoring from a dump. I'm talking about something which is 100% very well known to everyone who is at least a little bit experienced, but that doesn't make it a non-gotcha. It's still a gotcha that pg_restore doesn't calculate statistics and doesn't run vacuum.
Michael: I mean, it's quite remarkable. I know we could go on with this list for quite a while, and there are varying levels of gotchas, but I like how short and brief the list you sent me is, and I really think for many other databases it would be a much, much longer list. I'm listening to your experience of, you know, a couple of days of trying to use MySQL at the beginning, and recently I was listening to an interview with Tom Lane, and he mentioned checking out MySQL at the beginning and not liking a few things about it. And that feels like a really common experience. And having worked with commercial databases like SQL Server and Oracle, which often move much faster in terms of adding features that enterprise clients really want, there are just so many more gotchas that result from processes like that. So I imagine... though I could be wrong.
Nikolay: I think Postgres has lots
of gotchas, lots of them.
Michael: But compared to other
relational databases?
Nikolay: Yeah, it's a complex system, and all complex systems have a lot of gotchas. Some of them are just very niche, very narrow and deep, and so on. Some of them, nobody knows about.
For example, SELECT INTO.
This is what you can write inside
PL/pgSQL, right?
So you write a FUNCTION or TRIGGER,
and you write SELECT INTO
some variable, blah, blah, blah.
And then you think: OK, maybe this SELECT is actually complex, with a subselect or, I don't know, joins and so on. But in the end you have a scalar or something, and you select it into a variable inside PL/pgSQL. And then you think: I want to understand the plan, or I want to just run it and troubleshoot, and you paste it into psql. What will happen? If you run SELECT INTO blah blah blah, and then some regular select, from, where, order by, you know...
Michael: Just an error? No? What happens?
Nikolay: It will create a table.
Michael: Oh, okay, yeah.
Nikolay: SELECT INTO... it creates a table. This, I think, should just be removed, completely removed, deprecated. SELECT INTO creating a table, that's DDL. This is definitely a gotcha for everyone.
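A minimal sketch of the two meanings (the identifiers are made up):

```sql
-- Inside PL/pgSQL, INTO fills a variable:
DO $$
DECLARE
  cnt bigint;
BEGIN
  SELECT count(*) INTO cnt FROM pg_class;
  RAISE NOTICE 'count: %', cnt;
END $$;

-- The same shape at the top level in psql is DDL: it creates a table!
SELECT count(*) INTO my_new_table FROM pg_class;
-- roughly equivalent to:
-- CREATE TABLE my_new_table AS SELECT count(*) FROM pg_class;
```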
Michael: Yes, imagine all those parsers that just... I know this is flawed for multiple reasons, but they look at any query that starts with SELECT and go: oh, it's a read query.
Nikolay: Well, yeah, usually we say SELECT FOR UPDATE acquires a heavy lock, blocking writes. Yes, blocking writes on these particular rows, not the whole table. But SELECT can create a table. And we already have CREATE TABLE AS SELECT, and that's what all the people who need it use, I think. So SELECT INTO is probably just some very old stuff that could potentially be removed, I think.
Michael: If it's not in the SQL standard...
Nikolay: Oh, that's a good point. If it is, no chances, right?
Michael: Yeah, well, probably not, no. Cool. Anything else you want to make sure we covered?
Nikolay: Maybe it's enough. I think we covered less than 1% of what we could. That's my feeling. What do you think?
Michael: Well, it depends what
you count as gotchas and how niche
you're willing to go.
It turns out the more you know,
the more gotchas you know about,
I guess.
Nikolay: So I could not resist, and I checked the documentation. The SQL standard uses SELECT INTO to represent selecting values into a scalar variable of a host program, rather than creating a new table. So it's definitely a Postgresism, and I think it should be removed.
Michael: Okay, cool, good 1.
Nikolay: Good, okay. Good chat, maybe somewhat more entertaining than useful, but I enjoyed it.
Michael: Yeah, absolutely.
And feel free to send us any more that you've got, or that you...
Nikolay: ...Maybe we should think about specific areas, like we did just now, and explore some areas from this perspective: what can be unexpected?
Michael: Yeah, for sure. Thanks so much, Nikolay. Catch you next week.
Nikolay: Thank you.
You too.