A weekly podcast about all things PostgreSQL
Michael: Hello and welcome to PostgresFM,
a weekly show about
all things PostgreSQL.
I am Michael, founder of pgMustard.
This is Nikolay, founder of Postgres.AI.
Hey Nikolay, how are you doing?
Nikolay: Hi Michael, I'm doing
great.
How are you?
Michael: I'm good also.
So this week, well I was in charge
of choosing, but I've actually
picked something that you suggested.
I was looking through all the listener
suggestions, all of the
ideas we've had in the past.
And this was one of yours, right?
Nikolay: I have no idea.
I already forgot.
It was my idea?
Okay.
Michael: So you suggested a while
back that we talk about super
user and especially super user
in the new normal, the new context
of cloud providers or managed services
and whether we, well,
the fact that we normally don't
have superuser access anymore
in those cloud environments, whether
we should, that kind of
thing.
So a little bit of a refresher on
what a superuser is, what it
can do and, and maybe why we don't
have it or why we should have
it.
Nikolay: Right.
Right.
Yeah.
Good questions.
Good.
Good.
What is a superuser, right?
First of all, it's just bypassing
all privilege, privilege checks,
right?
This is the idea.
Michael: Yeah, I looked it up in
the documentation.
Superuser bypasses all permission
checks except the right to
log in.
It's a dangerous privilege, should
not be used carelessly, it's
best to do most of your work as
a role that is not a superuser
and then talks about how to create
one.
Nikolay: So you can create a superuser
with no login flag, right?
Or...
Michael: I guess so.
Yeah.
Nikolay: Interesting.
Actually, I have a white spot in
my knowledge here, maybe.
Honestly.
But yeah, superuser is what people
sometimes use, not thinking
about permissions at all.
Even launching their services and
web applications using super
user database.
It's very, very bad practice.
Did you do that ever?
Michael: Yeah, it's the default,
right?
It's the default, well, you don't
know better.
Nikolay: Actually, yes, and it
means defaults here again are
not perfect because they don't
encourage you to create a non-superuser
database or user that you will
be using.
Also, I can remember some companies,
small and big ones, doesn't
matter, which give a single superuser
or not.
So it's different kinds of scenes,
different scenes.
One scene is, let's use one single
superuser for all people who have
DBA access, SRE access, like admin
access, and share it.
Or let's give everyone a super
user, but named one.
Maybe the first thing, like, it's
different, like, so different
scenes, like, use super user by
default when you work with
database, checking something.
And a different scene is like, let's
share an account, let's share
database role.
Both are not good things, right?
But at least if you separate roles
and create multiple superusers,
it's already slightly better.
Not slightly, actually.
It's significantly better because
you can see who is doing what,
at least, right?
Distinguish people.
But in general, yeah, Postgres,
how it's organized, it provokes
you to use superuser for everything
by default, and you need
to make efforts to go out of it.
Most people at least realize this
and at least stop using superuser
for application work.
This is number one thing to do.
Okay, you are going to use Superuser
for yourself because you
have all the rights, you are maybe
the owner of everything, right?
You own this database, so okay,
you have Superuser.
It's a separate question, should
you always log in as Superuser,
right?
Maybe you should log in as a normal
user, a regular one with limited
permissions and only if needed,
use superuser access.
But at least all application code
must not use superuser.
We use superuser, right?
This is obvious.
Michael: Well, and why?
Like, it's the danger, right?
Nikolay: Well, security.
My favorite topic, security, right?
I'm joking.
Michael: Well, security is a really
good reason, but I think
also, like, the danger of being
able to, like, drop things, being
able to destroy data, it's not
just a security issue, right?
Somebody could steal everything,
but they could also just destroy
everything and not steal anything.
Nikolay: Right.
Well, it's also a kind of type
of security.
Well, it might be reliability or
something, but it's still insecure
to give everyone...
It's not about somebody outside
of the company stealing data, but
even inside the company somebody
made a mistake.
It's also about, it means that
the work is not secure, right?
But I wish auditors dug into this
hole deeper.
Like many companies already reported,
I mean Postgres companies
reported, they have SOC2, and some
companies go to IPO, and they
have a lot of auditing activities
from external auditors, right?
And I know companies who are very
well-known, they have a bunch
of questions and some of these
questions sometimes are related to
Postgres, but when you look at
them, being Postgres experts, they
look funny, usually.
So yeah, I think it would be good
to create some standard or
something.
Actually, there is some standard,
right?
Crunchy Data they shared a big
PDF a few years ago, which is
aimed to make Postgres setup more
secure and it was for, I think,
for army or something like that development.
Michael: Yeah, the US military I think
collaborated with them on
it.
Nikolay: Right, this is a good
thing. And even better that they
shared it and this became public
so you can use it to grab some
things, and obviously a lot of things
are related to permissions
and what kind of database user
you use, right?
But if your company right now already
got SOC 2 or is doing this
or IPO or something, I would not
rely on external auditors to
say, okay, we are good here.
They suck there.
Their questions don't cover this
topic almost.
Or at least, what I saw so far over
the last 5 years.
I didn't see good questions, honestly.
Like, some of them were kind of
good, but this kind of topic,
like, do you use super user?
Do you distinguish users, I mean,
humans?
And so on and so forth.
Like, what is your model for these
privileges and so on?
Quite weak.
I'm not an expert, as usual, I
say.
I'm not an expert in security at
all.
This is not, like, this is maybe
one of the least favorite topics
in databases.
But it's a super important topic,
right?
So, yeah.
Okay, so when should we use superuser?
Michael: That was where I was thinking
we were gonna go.
So You mentioned a while back,
I think maybe on Twitter, maybe
just to me, I can't remember, that
there's a bunch of times when
you're using it, when you're maybe
with a client that's using
a managed service provider, that
it's frustrating to you that
you don't have superuser access.
So I was interested, like, when
are those times, like, What are
you trying to do that you can't
do without it or that's difficult
to do without it?
Nikolay: Yeah, there are things
that only a superuser can do,
obviously.
For example, COPY FROM PROGRAM
is a dangerous thing to do because
you can basically execute any shellcode
under the Postgres OS user,
Linux user, right?
And some things like I don't remember
exactly But there are certain
types of things where you need superuser,
definitely.
Michael: So I found a list.
I was looking at all the different
cloud providers and whether
they do or don't provide it.
And there's a really good list
in the Supabase docs of what
is unsupported in their highest
privileged role.
So they don't, let me just read
it quickly.
Supabase provides a default post
control to all instances deployed.
Superuser access is not given as
it allows destructive operations
to be performed on the database.
And so those unsupported operations
are.
Nikolay: Destructive, Okay.
Michael: Yeah, it was an interesting
choice of words.
CREATE ROLE WITH REPLICATION, CREATE
SUBSCRIPTION, CREATE EVENT
TRIGGER, COPY FROM PROGRAM, as
you mentioned, and of course,
ALTER USER WITH SUPERUSER, so
you can't make other users superusers.
Nikolay: Well, these are destructive
actions.
If they are destructive, let's
remove them from PostgreSQL.
This is judgment, like, let me
judge, right?
I'm the owner of this database,
or who is the owner of the database?
Let me drop my position, it's very
simple.
Two things.
First thing, for managed providers,
right?
Managed PostgreSQL providers.
If you don't run in a container,
question why not?
Right?
Run in a container.
Or actually, or in a separate VM.
You probably run it in a separate
VM or in a container at least,
right?
So, it's...
Michael: So, you mean like is it
isolated?
So, if somebody breaks out and,
or yeah, if somebody else is
being destructive, it doesn't affect
you, if, you know, a different
customer.
That, yeah.
Nikolay: A Firecracker microVM
or something, a lot of things,
but at least container.
So it's already isolated.
If it's isolated, second point,
give me superuser because I'm
the owner of this database, that's
it.
And they usually say, most of them
say, it's for your safety,
right?
But it's bullshit and a lie.
It's a lie.
Because I know for sure that inside
a big provider, very big provider,
teams, this topic pops up from
time to time.
And technical people usually say,
let's do it.
There are no big reasons to say,
like, we protect these users.
Like, when AWS gives you an EC2 instance,
it's a virtual machine
with Linux, They give you root,
right?
Michael: Oh, I don't know.
Nikolay: They give you root, of
course.
Well, I have root.
Then people say, okay, but for
Postgres, we have a lot of automation
and you can break it.
Well, if I break it, I break it.
So if I execute copy from program
as I did with Crunchy Bridge
and I move pg_wal directory, well
this is destructive action.
I copy from program to table and
I just MV PgWal2 PgWal2 and
I've got a panic, I mean database
got panic.
I've got a big joy observing that
I can destroy myself, I mean
my database because this is how
I think, okay, I own it at least
like, okay Not directly, but I
own it You know, there is a philosophical
There is a philosophical very good
statement You can truly own
only what you can destroy.
Ooh.
Right?
Right?
Michael: I've not heard it, but
it makes sense.
Nikolay: It makes sense in anything,
right?
For example, if you cannot destroy
your own company, if you're
a startup guy, founder, you do
not own it at all.
Maybe investors own it, right?
And you should realize it.
Who can destroy it?
This is like ownership without
the ability to destroy it.
So here we come to destructive
actions, but let me judge it.
Let me judge it and let me feel
it, right?
So, I don't accept any reasons
like that, we protect you, you
can destroy it, our support will
be fed up with questions, I
destroyed something.
Well, if you destroy something,
you recover from backups, that's
it.
But they also say, okay, if we
have a lot of automation, and
if we allow you to copy from programs,
superuser, for example,
then you will be able as a user
to see our automation pieces,
right?
Reverse engineer it.
This is already a real reason.
This is the real reason.
They don't tell you as a first
reason, but this is the number
1 reason, honestly.
They don't want you to see the
automation from inside.
Crunchy Bridge is great here.
They don't care.
I'm not sure if their product is
open source.
There are doubts on it, because
they stopped publishing images,
as I know, and so on.
But here, they give you superuser.
You can go copy from program and
explore, you know, like directory
layout and so on everything, find
which programs are there, probably
try to execute them.
Like it's your world, you really
own it.
They give you this, they charge
you extra.
You charge me extra and you protect
me from myself?
I don't know.
There will be time when people
start realizing it, and I hope
auditors will also realize it.
Who owns this database?
Michael: Yeah.
Of all the ones I checked, Crunchybridge
were the only ones that
supplied full superuser.
A lot of the other ones create
kind of a pseudo role just below
that with like super base.
I think couldn't find a list of
things that were removed from
others.
But Crunchy Bridge does definitely
deserve credit for that.
And especially if you consider
they were the ones that are publishing
the security guide.
I feel like that's a really good
argument for it's possible it's
just maybe people don't want to
do it I would say they've got
another possible advantage or at
least it may be like some other
reasons I I am not as convinced
as you as the support reasons
not a good 1 because even to investigate
was it the user that
messed up or was it us that messed
up is actually quite difficult
sometimes in support.
I don't know if you've ever had
that.
Nikolay: Do you use RDS or Cloud
SQL or something?
Do you know how calls with support
work?
Do you see how the calls are usually
organized?
Michael: I haven't ever used their
support.
Nikolay: I mean, any of managed
services support.
If you just try something, they
won't walk you step by step what
Happened.
They will offer you to recover
from backup, blah, blah, blah,
like that.
Of course, there are logs, right?
Michael: So...
And possibly ironically, I have
actually seen people have great
experience with Crunchy Data support.
This is not sponsored by Crunchy
Data, I promise, but there are
providers out there giving really
good support.
And I could imagine an argument
for, especially if you're a provider
that provides a free tier, for
example.
The ones that charge more than
it would cost you to host your
own, I can see the argument for
maybe they should provide super
user access.
If it's a free service...
Oh, go on.
Nikolay: So yesterday I had 1 million
rows in the table, but
today it's only like minus 10 rows.
Where are those 10 rows, right?
Should I go to support?
Because maybe it's a bug, maybe
it's a bug of their automation,
maybe it's a bug of Postgres itself,
rows disappeared.
Michael: If you've ever run a product,
you're going to get some
customers that come with support
questions that it turns out
it's nothing to do with your product.
It happens to all of us, right?
Nikolay: Or maybe I know I executed
a delete.
Or maybe I know my application
could have executed a delete.
This is some kind of problem.
It's the same question.
The same type of question.
That's why I say then they say
we're protective.
It's bullshit.
Michael: On that note, maybe to
move on slightly, I did actually
notice in the Crunchy docs that
for the superuser role, they
have pgAudit on by default to
log what it's doing.
Nikolay: Yeah. This is what I suspected.
Michael: That's interesting, right?
Nikolay: Yeah.
Well, yes.
If you enable it for all users,
so your logs will be flooded
with a lot of data, and like it
will become a performance
bottleneck very quickly.
So it makes sense to enable
it for superusers and capture
everything that's happening.
Michael: And of course, I suspect
you could change that if you've
got superuser access.
Nikolay: You can change that.
Michael: But I think the idea is
to help people help themselves
and also help you support them
if it may be alarming if it looks
like they are using it as their
application user or
Nikolay: something like that.
People are not stupid in general.
There are stupid people, but there
is only a minority of them.
And if you say we protect you,
it means like you don't trust
your own customers.
It's bullshit.
That's why I say it's bullshit.
So this is a good thing to have
good defaults.
For example, OK, superusers and
PGAudit is enabled.
If someone disables it, this record
that got disabled goes to
logs, so we have footprints, right,
of this action.
But in general, people can...
At least somehow customers I deal
with, I see opposite.
I sometimes think, oh, I need to
explain to you this, but they're
quite smart.
They're like, okay, we already
got this.
So they can understand what's happening.
My point is that they don't give
this to you to protect them,
not you.
They want to be protected.
They want to share what they got.
And this is like a business decision,
it's not a technical decision.
So they don't want to share automation
and how exactly they adjusted
Postgres and so on and so forth.
But I hope people will start realizing
this and cases like Country
Bridge will be more common and
people who are truly open-source
believers and lovers, they will
shift to a more open approach and
a more trustful approach, like trust
your customers, they can decide
if it's destructive or not, right?
And just keep everything open,
share your automation, and give
ownership and access to your customers.
In this case, the premium is usually
quite significant over the infrastructure
costs.
It would be reasonable to pay,
right?
Now there is an imbalance in this
world, as I see.
And I hope with time we also
realize it and start asking
questions like who is the actual owner
of this database?
Michael: Yeah, interesting, good
point.
But what do you think about free,
like, I completely take your
point on providers that are charging
a premium over what it would
cost you to run the service.
But what about the ones that are
offering you a free tier, like
Neon or Supabase, or even, there
are some newer ones as well
Nikolay: Well, how is it different
Michael: Well, I actually don't
know; maybe it's a premium
feature, maybe like, if it ends up
in more support
personally, even if you don't
agree that it
should end in more support, I think
it would overall
Nikolay: I don't think it will
be a big part of the whole picture.
won't be a big part I mean this
kind of questions like I destroyed
my database and I don't of course
if it got destroyed and you
didn't do anything, this is a good
question.
But if you have a pgAudit setup,
you can close and support
easily can point to small how-to,
how to understand what's happening,
and that's it.
And support usually doesn't look
inside your database, right?
This is your area.
Like RDS support, for example,
you need to have a high level
of support and convincing.
They usually check only the VM
and the underlying things and
the infrastructure things, right?
What's happening inside your database?
Who deleted rows or who moved,
for example, a pg_wal?
I still say this is the same level
of things.
It's your area.
Michael: Yeah, true.
I mean, there's, I think even...
Nikolay: I know many people won't
agree with me, actually.
Michael: I actually don't know
if they would anymore.
I think there's an increasing education
around data processing,
especially with all that the privacy
laws in the EU and in California,
around whose data is it and who's
processing the data, who owns
the data.
Nikolay: Right.
Yeah, in true spirit of some of
those laws, like if this is my
data, I should own it and be able
to destroy everything, not
only at logical level, like delete.
Maybe I want to destroy pg_wal
right now myself, you know, I
don't know, like this is true ownership.
And see how it works inside, like
I need to feel it, this is
true ownership.
In my opinion, inspect it.
Michael: I can see the argument
a little bit for things like
replication or you know some of
the let's say some of the cloud
providers they said they have for
example really easy single
checkbox high availability or
something and then you go and
destroy replication or you mess
something up that means that's
not working anymore.
Nikolay: Usually such providers
don't use Postgres replication.
This checkbox usually is not based
on Postgres replication.
It's usually underlying block storage
device replication, synchronous
usually.
So, this is what I know about RDS
and Cloud SQL.
So, usually this is lower level.
Michael: Cool.
Makes sense.
Maybe you can't even mess that
up.
Great.
Nikolay: So anyway, I know we live
in the world when people say
like...
Honestly, I also think most of
developers look at what RDS did
or CloudSQL did or Microsoft guys
did, and they just copy this
approach, not thinking deeper.
Crunchy is, as you mentioned, they
are security experts, obviously,
and they are brave.
So kudos, actually.
I already told this a couple of
times on Twitter.
This is great.
But others just copy what others,
like, say, smaller or new providers,
they just copy decisions from bigger
providers and copy their
arguments like this is to protect
users, this is to protect customers,
this is for your own safety, let
me decide what is destructive.
Michael: Yeah, well and I know
quite a few managed service providers
listen and people that work at
them it would be great to hear
from you if there's something we've
missed or if there's a If
there's a way of explaining this
that would be would be better
let us know.
Nikolay: Yeah, actually I can criticize
this easily because I
don't develop a managed service,
managed Postgres service, right?
Yeah.
Because in this case I would need
to be more careful because
I would have my situation.
But honestly, several times I saw
this, like how to implement,
how to protect, what we should
protect, if you provide supervision,
what kind of dangers exist.
And the list is obviously, this
is copied from programs, foreign
data wrappers probably, and so
on, like, dangerous parts.
So if you think about it, you can
implement a good model, and
I think, of course, maybe Postgres
could provide some additional
tools to restrict certain areas,
but if it's my database, I should
decide what to enable, what to
disable, and at which point for
whom, right?
Michael: Yeah I don't know if Postgres
can do anything about
this because ultimately these are
features that we need, like
someone needs to be able to do
them.
Well, maybe if no one needs to be
able to do them, it shouldn't
be in Postgres at all.
But if someone wants to be able
to do them, it seems silly to
me not like to disable it because
we want to be able to host
it in clouds?
Nikolay: Yeah, well, I want to
disable it sometimes.
For example, in Database Lab, we
have a job bot, right?
And we don't want to job bot to
like we want as much freedom
as possible for end users to execute
any SQL.
But if any SQL is copied from a
program, it can be dangerous
because all foreign data are hyper
because maybe users who are
end users, they are still inside
the same team, but maybe admins
doesn't want them to execute it.
And we just want to protect here.
Some users have full access, admin
decides, right?
But some users who work only at
this level, this SQL experimentation
level, they are restricted.
And at some point we removed all
possibilities of copy of program
from the developers and so on and
so on.
So they cannot do harm even if
they are inside the same team.
But the admin decides which permissions
to provide, right?
So in the end of the day, you own
this database and you decide
what to do.
When I say you, I mean admins,
right?
Because inside, bigger customer
there might be some additional
users, right?
And the good tool can provide the
ability to control those permissions.
Michael: Yeah, and that's what
we're saying, right?
Like, we're not saying RDS should
provide superuser to every
single person who has access to
the RDS dashboard, but to like
an admin.
That's what we're asking for, right?
1 person or a small group of users.
Nikolay: Right, right.
Well, of course there is a chain
reaction.
If, for example, RDS, for example,
Okay, we provide superusers,
but then what to do with access
to WAL files, backups, physical
backups, physical replication connection,
and so on and so on.
There is a chain reaction here,
right?
And they restrict you here, partially
because it's kind of vendor
locking, obviously.
So they also restrict a number
of things, like for example, recovery
target LSN, so you cannot perform
0 downtime upgrades with our
recipe, which involves recovery
target LSN.
You can only do with slot advancement,
we call it Instacart approach,
right, this article, Instacart
approach, which the opinions is
more risky.
Right, so recovery target LSN,
you need to do it, like you need
to provide it.
So, I mean, if you provide superuser,
this is a Pandora's box
in terms of decisions what to provide
to your users.
Michael: Well, it's a Pandora's box,
but it's making all of those
decisions at once and saying we
let you do anything.
Nikolay: Yeah, that's cool.
You are the owner, you decide what
to do.
We provide you automation, we charge
for it.
That's it.
Now we live in a different world,
we provide you some automation,
we hide a lot of capabilities Postgres
has from you, we restrict
you and charge for it.
It's not fair.
But this is the most popular approach
right now.
We had discussions about managed
services, right?
Michael: Yeah, second episode actually.
Maybe it's a little bit of a loop.
Nikolay: Right, right.
I feel the loop indeed, but a different
angle completely.
We had the previous episode also
echoing the first one.
Interesting.
So we know the next topic should
echo the third one.
Let's do it.
Okay.
Anyway, okay.
That's it.
Michael: Podcast wraparound.
Nikolay: Yeah.
Right.
So maybe this discussion was not
too technical, right?
But you can go read the documentation
if you want technical.
Michael: Yeah, and there were good
talks on roles and security
and just covering the basics.
There's loads of documentation
on this kind of thing.
We can include some links.
Nikolay: Right, yeah.
We had some philosophical discussion.
Do you own your database?
RDS users, do you own your databases?
Who owns it?
Maybe people don't care about it
until some auditors decide that
it's not truly ownership.
But probably, how AWS is organized,
probably they will never
raise this question.
Everyone is happy.
So I don't know.
Michael: And at least there's choice
out there now.
There's at least one provider that
does, and you can self-host.
You can even use the cloud and
manage it yourself.
Nikolay: I also say I want to own
my bloat, right?
Because AWS doesn't allow you
to take your physical level.
Michael: Logical replication.
Nikolay: Yeah, logical, you lose
the bloat your data files have.
And you cannot stop thinking about
bloat because it's Postgres.
Bloat is the center of architecture.
So you need to understand how this
works if you want good performance.
So, yeah, interesting.
So you work at an abstraction only,
some abstraction level.
And you cannot copy files.
Okay, maybe enough.
Michael: Thanks so much, Nikolay.
Thanks, everyone.
And see you next week.
Nikolay: Thank you.