Postgres FM

Nikolay and Michael discuss the superuser role in PostgreSQL — what it is, how and when it shouldn’t be used, and whether most cloud providers are right to not give us it (no prizes for guessing).
 
Here are some links to things they mentioned:

~~~

What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!

~~~

Postgres FM is brought to you by:
With special thanks to:

Creators & Guests

Host
Michael Christofides
Founder of pgMustard
Host
Nikolay Samokhvalov
Founder of Postgres AI

What is Postgres FM?

A weekly podcast about all things PostgreSQL

Michael: Hello and welcome to PostgresFM,
a weekly show about

all things PostgreSQL.

I am Michael, founder of pgMustard.

This is Nikolay, founder of Postgres.AI.

Hey Nikolay, how are you doing?

Nikolay: Hi Michael, I'm doing
great.

How are you?

Michael: I'm good also.

So this week, well I was in charge
of choosing, but I've actually

picked something that you suggested.

I was looking through all the listener
suggestions, all of the

ideas we've had in the past.

And this was one of yours, right?

Nikolay: I have no idea.

I already forgot.

It was my idea?

Okay.

Michael: So you suggested a while
back that we talk about super

user and especially super user
in the new normal, the new context

of cloud providers or managed services
and whether we, well,

the fact that we normally don't
have superuser access anymore

in those cloud environments, whether
we should, that kind of

thing.

So a little bit of a refresher on
what a superuser is, what it

can do and, and maybe why we don't
have it or why we should have

it.

Nikolay: Right.

Right.

Yeah.

Good questions.

Good.

Good.

What is a superuser, right?

First of all, it's just bypassing
all privilege, privilege checks,

right?

This is the idea.

Michael: Yeah, I looked it up in
the documentation.

Superuser bypasses all permission
checks except the right to

log in.

It's a dangerous privilege, should
not be used carelessly, it's

best to do most of your work as
a role that is not a superuser

and then talks about how to create
one.

Nikolay: So you can create a superuser
with no login flag, right?

Or...

Michael: I guess so.

Yeah.

Nikolay: Interesting.

Actually, I have a white spot in
my knowledge here, maybe.

Honestly.

But yeah, superuser is what people
sometimes use, not thinking

about permissions at all.

Even launching their services and
web applications using super

user database.

It's very, very bad practice.

Did you do that ever?

Michael: Yeah, it's the default,
right?

It's the default, well, you don't
know better.

Nikolay: Actually, yes, and it
means defaults here again are

not perfect because they don't
encourage you to create a non-superuser

database or user that you will
be using.

Also, I can remember some companies,
small and big ones, doesn't

matter, which give a single superuser
or not.

So it's different kinds of scenes,
different scenes.

One scene is, let's use one single
superuser for all people who have

DBA access, SRE access, like admin
access, and share it.

Or let's give everyone a super
user, but named one.

Maybe the first thing, like, it's
different, like, so different

scenes, like, use super user by
default when you work with

database, checking something.

And a different scene is like, let's
share an account, let's share

database role.

Both are not good things, right?

But at least if you separate roles
and create multiple superusers,

it's already slightly better.

Not slightly, actually.

It's significantly better because
you can see who is doing what,

at least, right?

Distinguish people.

But in general, yeah, Postgres,
how it's organized, it provokes

you to use superuser for everything
by default, and you need

to make efforts to go out of it.

Most people at least realize this
and at least stop using superuser

for application work.

This is number one thing to do.

Okay, you are going to use Superuser
for yourself because you

have all the rights, you are maybe
the owner of everything, right?

You own this database, so okay,
you have Superuser.

It's a separate question, should
you always log in as Superuser,

right?

Maybe you should log in as a normal
user, a regular one with limited

permissions and only if needed,
use superuser access.

But at least all application code
must not use superuser.

We use superuser, right?

This is obvious.

Michael: Well, and why?

Like, it's the danger, right?

Nikolay: Well, security.

My favorite topic, security, right?

I'm joking.

Michael: Well, security is a really
good reason, but I think

also, like, the danger of being
able to, like, drop things, being

able to destroy data, it's not
just a security issue, right?

Somebody could steal everything,
but they could also just destroy

everything and not steal anything.

Nikolay: Right.

Well, it's also a kind of type
of security.

Well, it might be reliability or
something, but it's still insecure

to give everyone...

It's not about somebody outside
of the company stealing data, but

even inside the company somebody
made a mistake.

It's also about, it means that
the work is not secure, right?

But I wish auditors dug into this
hole deeper.

Like many companies already reported,
I mean Postgres companies

reported, they have SOC2, and some
companies go to IPO, and they

have a lot of auditing activities
from external auditors, right?

And I know companies who are very
well-known, they have a bunch

of questions and some of these
questions sometimes are related to

Postgres, but when you look at
them, being Postgres experts, they

look funny, usually.

So yeah, I think it would be good
to create some standard or

something.

Actually, there is some standard,
right?

Crunchy Data they shared a big
PDF a few years ago, which is

aimed to make Postgres setup more
secure and it was for, I think,

for army or something like that development.

Michael: Yeah, the US military I think
collaborated with them on

it.

Nikolay: Right, this is a good
thing. And even better that they

shared it and this became public
so you can use it to grab some

things, and obviously a lot of things
are related to permissions

and what kind of database user
you use, right?

But if your company right now already
got SOC 2 or is doing this

or IPO or something, I would not
rely on external auditors to

say, okay, we are good here.

They suck there.

Their questions don't cover this
topic almost.

Or at least, what I saw so far over
the last 5 years.

I didn't see good questions, honestly.

Like, some of them were kind of
good, but this kind of topic,

like, do you use super user?

Do you distinguish users, I mean,
humans?

And so on and so forth.

Like, what is your model for these
privileges and so on?

Quite weak.

I'm not an expert, as usual, I
say.

I'm not an expert in security at
all.

This is not, like, this is maybe
one of the least favorite topics

in databases.

But it's a super important topic,
right?

So, yeah.

Okay, so when should we use superuser?

Michael: That was where I was thinking
we were gonna go.

So You mentioned a while back,
I think maybe on Twitter, maybe

just to me, I can't remember, that
there's a bunch of times when

you're using it, when you're maybe
with a client that's using

a managed service provider, that
it's frustrating to you that

you don't have superuser access.

So I was interested, like, when
are those times, like, What are

you trying to do that you can't
do without it or that's difficult

to do without it?

Nikolay: Yeah, there are things
that only a superuser can do,

obviously.

For example, COPY FROM PROGRAM
is a dangerous thing to do because

you can basically execute any shellcode
under the Postgres OS user,

Linux user, right?

And some things like I don't remember
exactly But there are certain

types of things where you need superuser,
definitely.

Michael: So I found a list.

I was looking at all the different
cloud providers and whether

they do or don't provide it.

And there's a really good list
in the Supabase docs of what

is unsupported in their highest
privileged role.

So they don't, let me just read
it quickly.

Supabase provides a default post
control to all instances deployed.

Superuser access is not given as
it allows destructive operations

to be performed on the database.

And so those unsupported operations
are.

Nikolay: Destructive, Okay.

Michael: Yeah, it was an interesting
choice of words.

CREATE ROLE WITH REPLICATION, CREATE
SUBSCRIPTION, CREATE EVENT

TRIGGER, COPY FROM PROGRAM, as
you mentioned, and of course,

ALTER USER WITH SUPERUSER, so
you can't make other users superusers.

Nikolay: Well, these are destructive
actions.

If they are destructive, let's
remove them from PostgreSQL.

This is judgment, like, let me
judge, right?

I'm the owner of this database,
or who is the owner of the database?

Let me drop my position, it's very
simple.

Two things.

First thing, for managed providers,
right?

Managed PostgreSQL providers.

If you don't run in a container,
question why not?

Right?

Run in a container.

Or actually, or in a separate VM.

You probably run it in a separate
VM or in a container at least,

right?

So, it's...

Michael: So, you mean like is it
isolated?

So, if somebody breaks out and,
or yeah, if somebody else is

being destructive, it doesn't affect
you, if, you know, a different

customer.

That, yeah.

Nikolay: A Firecracker microVM
or something, a lot of things,

but at least container.

So it's already isolated.

If it's isolated, second point,
give me superuser because I'm

the owner of this database, that's
it.

And they usually say, most of them
say, it's for your safety,

right?

But it's bullshit and a lie.

It's a lie.

Because I know for sure that inside
a big provider, very big provider,

teams, this topic pops up from
time to time.

And technical people usually say,
let's do it.

There are no big reasons to say,
like, we protect these users.

Like, when AWS gives you an EC2 instance,
it's a virtual machine

with Linux, They give you root,
right?

Michael: Oh, I don't know.

Nikolay: They give you root, of
course.

Well, I have root.

Then people say, okay, but for
Postgres, we have a lot of automation

and you can break it.

Well, if I break it, I break it.

So if I execute copy from program
as I did with Crunchy Bridge

and I move pg_wal directory, well
this is destructive action.

I copy from program to table and
I just MV PgWal2 PgWal2 and

I've got a panic, I mean database
got panic.

I've got a big joy observing that
I can destroy myself, I mean

my database because this is how
I think, okay, I own it at least

like, okay Not directly, but I
own it You know, there is a philosophical

There is a philosophical very good
statement You can truly own

only what you can destroy.

Ooh.

Right?

Right?

Michael: I've not heard it, but
it makes sense.

Nikolay: It makes sense in anything,
right?

For example, if you cannot destroy
your own company, if you're

a startup guy, founder, you do
not own it at all.

Maybe investors own it, right?

And you should realize it.

Who can destroy it?

This is like ownership without
the ability to destroy it.

So here we come to destructive
actions, but let me judge it.

Let me judge it and let me feel
it, right?

So, I don't accept any reasons
like that, we protect you, you

can destroy it, our support will
be fed up with questions, I

destroyed something.

Well, if you destroy something,
you recover from backups, that's

it.

But they also say, okay, if we
have a lot of automation, and

if we allow you to copy from programs,
superuser, for example,

then you will be able as a user
to see our automation pieces,

right?

Reverse engineer it.

This is already a real reason.

This is the real reason.

They don't tell you as a first
reason, but this is the number

1 reason, honestly.

They don't want you to see the
automation from inside.

Crunchy Bridge is great here.

They don't care.

I'm not sure if their product is
open source.

There are doubts on it, because
they stopped publishing images,

as I know, and so on.

But here, they give you superuser.

You can go copy from program and
explore, you know, like directory

layout and so on everything, find
which programs are there, probably

try to execute them.

Like it's your world, you really
own it.

They give you this, they charge
you extra.

You charge me extra and you protect
me from myself?

I don't know.

There will be time when people
start realizing it, and I hope

auditors will also realize it.

Who owns this database?

Michael: Yeah.

Of all the ones I checked, Crunchybridge
were the only ones that

supplied full superuser.

A lot of the other ones create
kind of a pseudo role just below

that with like super base.

I think couldn't find a list of
things that were removed from

others.

But Crunchy Bridge does definitely
deserve credit for that.

And especially if you consider
they were the ones that are publishing

the security guide.

I feel like that's a really good
argument for it's possible it's

just maybe people don't want to
do it I would say they've got

another possible advantage or at
least it may be like some other

reasons I I am not as convinced
as you as the support reasons

not a good 1 because even to investigate
was it the user that

messed up or was it us that messed
up is actually quite difficult

sometimes in support.

I don't know if you've ever had
that.

Nikolay: Do you use RDS or Cloud
SQL or something?

Do you know how calls with support
work?

Do you see how the calls are usually
organized?

Michael: I haven't ever used their
support.

Nikolay: I mean, any of managed
services support.

If you just try something, they
won't walk you step by step what

Happened.

They will offer you to recover
from backup, blah, blah, blah,

like that.

Of course, there are logs, right?

Michael: So...

And possibly ironically, I have
actually seen people have great

experience with Crunchy Data support.

This is not sponsored by Crunchy
Data, I promise, but there are

providers out there giving really
good support.

And I could imagine an argument
for, especially if you're a provider

that provides a free tier, for
example.

The ones that charge more than
it would cost you to host your

own, I can see the argument for
maybe they should provide super

user access.

If it's a free service...

Oh, go on.

Nikolay: So yesterday I had 1 million
rows in the table, but

today it's only like minus 10 rows.

Where are those 10 rows, right?

Should I go to support?

Because maybe it's a bug, maybe
it's a bug of their automation,

maybe it's a bug of Postgres itself,
rows disappeared.

Michael: If you've ever run a product,
you're going to get some

customers that come with support
questions that it turns out

it's nothing to do with your product.

It happens to all of us, right?

Nikolay: Or maybe I know I executed
a delete.

Or maybe I know my application
could have executed a delete.

This is some kind of problem.

It's the same question.

The same type of question.

That's why I say then they say
we're protective.

It's bullshit.

Michael: On that note, maybe to
move on slightly, I did actually

notice in the Crunchy docs that
for the superuser role, they

have pgAudit on by default to
log what it's doing.

Nikolay: Yeah. This is what I suspected.

Michael: That's interesting, right?

Nikolay: Yeah.

Well, yes.

If you enable it for all users,
so your logs will be flooded

with a lot of data, and like it
will become a performance

bottleneck very quickly.

So it makes sense to enable
it for superusers and capture

everything that's happening.

Michael: And of course, I suspect
you could change that if you've

got superuser access.

Nikolay: You can change that.

Michael: But I think the idea is
to help people help themselves

and also help you support them
if it may be alarming if it looks

like they are using it as their
application user or

Nikolay: something like that.

People are not stupid in general.

There are stupid people, but there
is only a minority of them.

And if you say we protect you,
it means like you don't trust

your own customers.

It's bullshit.

That's why I say it's bullshit.

So this is a good thing to have
good defaults.

For example, OK, superusers and
PGAudit is enabled.

If someone disables it, this record
that got disabled goes to

logs, so we have footprints, right,
of this action.

But in general, people can...

At least somehow customers I deal
with, I see opposite.

I sometimes think, oh, I need to
explain to you this, but they're

quite smart.

They're like, okay, we already
got this.

So they can understand what's happening.

My point is that they don't give
this to you to protect them,

not you.

They want to be protected.

They want to share what they got.

And this is like a business decision,
it's not a technical decision.

So they don't want to share automation
and how exactly they adjusted

Postgres and so on and so forth.

But I hope people will start realizing
this and cases like Country

Bridge will be more common and
people who are truly open-source

believers and lovers, they will
shift to a more open approach and

a more trustful approach, like trust
your customers, they can decide

if it's destructive or not, right?

And just keep everything open,
share your automation, and give

ownership and access to your customers.

In this case, the premium is usually
quite significant over the infrastructure

costs.

It would be reasonable to pay,
right?

Now there is an imbalance in this
world, as I see.

And I hope with time we also
realize it and start asking

questions like who is the actual owner
of this database?

Michael: Yeah, interesting, good
point.

But what do you think about free,
like, I completely take your

point on providers that are charging
a premium over what it would

cost you to run the service.

But what about the ones that are
offering you a free tier, like

Neon or Supabase, or even, there
are some newer ones as well

Nikolay: Well, how is it different

Michael: Well, I actually don't
know; maybe it's a premium

feature, maybe like, if it ends up
in more support

personally, even if you don't
agree that it

should end in more support, I think
it would overall

Nikolay: I don't think it will
be a big part of the whole picture.

won't be a big part I mean this
kind of questions like I destroyed

my database and I don't of course
if it got destroyed and you

didn't do anything, this is a good
question.

But if you have a pgAudit setup,
you can close and support

easily can point to small how-to,
how to understand what's happening,

and that's it.

And support usually doesn't look
inside your database, right?

This is your area.

Like RDS support, for example,
you need to have a high level

of support and convincing.

They usually check only the VM
and the underlying things and

the infrastructure things, right?

What's happening inside your database?

Who deleted rows or who moved,
for example, a pg_wal?

I still say this is the same level
of things.

It's your area.

Michael: Yeah, true.

I mean, there's, I think even...

Nikolay: I know many people won't
agree with me, actually.

Michael: I actually don't know
if they would anymore.

I think there's an increasing education
around data processing,

especially with all that the privacy
laws in the EU and in California,

around whose data is it and who's
processing the data, who owns

the data.

Nikolay: Right.

Yeah, in true spirit of some of
those laws, like if this is my

data, I should own it and be able
to destroy everything, not

only at logical level, like delete.

Maybe I want to destroy pg_wal
right now myself, you know, I

don't know, like this is true ownership.

And see how it works inside, like
I need to feel it, this is

true ownership.

In my opinion, inspect it.

Michael: I can see the argument
a little bit for things like

replication or you know some of
the let's say some of the cloud

providers they said they have for
example really easy single

checkbox high availability or
something and then you go and

destroy replication or you mess
something up that means that's

not working anymore.

Nikolay: Usually such providers
don't use Postgres replication.

This checkbox usually is not based
on Postgres replication.

It's usually underlying block storage
device replication, synchronous

usually.

So, this is what I know about RDS
and Cloud SQL.

So, usually this is lower level.

Michael: Cool.

Makes sense.

Maybe you can't even mess that
up.

Great.

Nikolay: So anyway, I know we live
in the world when people say

like...

Honestly, I also think most of
developers look at what RDS did

or CloudSQL did or Microsoft guys
did, and they just copy this

approach, not thinking deeper.

Crunchy is, as you mentioned, they
are security experts, obviously,

and they are brave.

So kudos, actually.

I already told this a couple of
times on Twitter.

This is great.

But others just copy what others,
like, say, smaller or new providers,

they just copy decisions from bigger
providers and copy their

arguments like this is to protect
users, this is to protect customers,

this is for your own safety, let
me decide what is destructive.

Michael: Yeah, well and I know
quite a few managed service providers

listen and people that work at
them it would be great to hear

from you if there's something we've
missed or if there's a If

there's a way of explaining this
that would be would be better

let us know.

Nikolay: Yeah, actually I can criticize
this easily because I

don't develop a managed service,
managed Postgres service, right?

Yeah.

Because in this case I would need
to be more careful because

I would have my situation.

But honestly, several times I saw
this, like how to implement,

how to protect, what we should
protect, if you provide supervision,

what kind of dangers exist.

And the list is obviously, this
is copied from programs, foreign

data wrappers probably, and so
on, like, dangerous parts.

So if you think about it, you can
implement a good model, and

I think, of course, maybe Postgres
could provide some additional

tools to restrict certain areas,
but if it's my database, I should

decide what to enable, what to
disable, and at which point for

whom, right?

Michael: Yeah I don't know if Postgres
can do anything about

this because ultimately these are
features that we need, like

someone needs to be able to do
them.

Well, maybe if no one needs to be
able to do them, it shouldn't

be in Postgres at all.

But if someone wants to be able
to do them, it seems silly to

me not like to disable it because
we want to be able to host

it in clouds?

Nikolay: Yeah, well, I want to
disable it sometimes.

For example, in Database Lab, we
have a job bot, right?

And we don't want to job bot to
like we want as much freedom

as possible for end users to execute
any SQL.

But if any SQL is copied from a
program, it can be dangerous

because all foreign data are hyper
because maybe users who are

end users, they are still inside
the same team, but maybe admins

doesn't want them to execute it.

And we just want to protect here.

Some users have full access, admin
decides, right?

But some users who work only at
this level, this SQL experimentation

level, they are restricted.

And at some point we removed all
possibilities of copy of program

from the developers and so on and
so on.

So they cannot do harm even if
they are inside the same team.

But the admin decides which permissions
to provide, right?

So in the end of the day, you own
this database and you decide

what to do.

When I say you, I mean admins,
right?

Because inside, bigger customer
there might be some additional

users, right?

And the good tool can provide the
ability to control those permissions.

Michael: Yeah, and that's what
we're saying, right?

Like, we're not saying RDS should
provide superuser to every

single person who has access to
the RDS dashboard, but to like

an admin.

That's what we're asking for, right?

1 person or a small group of users.

Nikolay: Right, right.

Well, of course there is a chain
reaction.

If, for example, RDS, for example,
Okay, we provide superusers,

but then what to do with access
to WAL files, backups, physical

backups, physical replication connection,
and so on and so on.

There is a chain reaction here,
right?

And they restrict you here, partially
because it's kind of vendor

locking, obviously.

So they also restrict a number
of things, like for example, recovery

target LSN, so you cannot perform
0 downtime upgrades with our

recipe, which involves recovery
target LSN.

You can only do with slot advancement,
we call it Instacart approach,

right, this article, Instacart
approach, which the opinions is

more risky.

Right, so recovery target LSN,
you need to do it, like you need

to provide it.

So, I mean, if you provide superuser,
this is a Pandora's box

in terms of decisions what to provide
to your users.

Michael: Well, it's a Pandora's box,
but it's making all of those

decisions at once and saying we
let you do anything.

Nikolay: Yeah, that's cool.

You are the owner, you decide what
to do.

We provide you automation, we charge
for it.

That's it.

Now we live in a different world,
we provide you some automation,

we hide a lot of capabilities Postgres
has from you, we restrict

you and charge for it.

It's not fair.

But this is the most popular approach
right now.

We had discussions about managed
services, right?

Michael: Yeah, second episode actually.

Maybe it's a little bit of a loop.

Nikolay: Right, right.

I feel the loop indeed, but a different
angle completely.

We had the previous episode also
echoing the first one.

Interesting.

So we know the next topic should
echo the third one.

Let's do it.

Okay.

Anyway, okay.

That's it.

Michael: Podcast wraparound.

Nikolay: Yeah.

Right.

So maybe this discussion was not
too technical, right?

But you can go read the documentation
if you want technical.

Michael: Yeah, and there were good
talks on roles and security

and just covering the basics.

There's loads of documentation
on this kind of thing.

We can include some links.

Nikolay: Right, yeah.

We had some philosophical discussion.

Do you own your database?

RDS users, do you own your databases?

Who owns it?

Maybe people don't care about it
until some auditors decide that

it's not truly ownership.

But probably, how AWS is organized,
probably they will never

raise this question.

Everyone is happy.

So I don't know.

Michael: And at least there's choice
out there now.

There's at least one provider that
does, and you can self-host.

You can even use the cloud and
manage it yourself.

Nikolay: I also say I want to own
my bloat, right?

Because AWS doesn't allow you
to take your physical level.

Michael: Logical replication.

Nikolay: Yeah, logical, you lose
the bloat your data files have.

And you cannot stop thinking about
bloat because it's Postgres.

Bloat is the center of architecture.

So you need to understand how this
works if you want good performance.

So, yeah, interesting.

So you work at an abstraction only,
some abstraction level.

And you cannot copy files.

Okay, maybe enough.

Michael: Thanks so much, Nikolay.

Thanks, everyone.

And see you next week.

Nikolay: Thank you.