Postgres FM

Nikolay and Michael discuss compression in Postgres — what's available natively, newer algorithms in recent versions, and several extensions with compression features.
 
Here are some links to things they mentioned:

~~~

What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!

~~~

Postgres FM is produced by:

With special thanks to:
  • Jessie Draws for the elephant artwork 


Nikolay: Hello, hello.

This is Postgres FM, episode number
107.

My name is Nikolay, founder of
Postgres.AI, and as usual, my

co-host is Michael, pgMustard.

Hi, Michael.

Michael: Hello, Nikolay.

Nikolay: So, you chose the second
most boring topic, in my opinion,

after security.

Tell us what it is.

Michael: Yeah, I can blame our
listeners for this one.

We had a… In my opinion…

Nikolay: Blame someone else, right?

Michael: Yeah, exactly.

Always.

That's the first rule.

Nikolay: This is my favorite methodology in troubleshooting of incidents.

Michael: No-blame culture, right?

So we had a great listener suggestion
to talk about compression.

And I guess it's kind of surprising
we haven't covered it yet.

We've covered various topics around this, but specifically in the context of a couple of extensions that offer compression options.

I thought it was a good topic to
cover generally.

We can talk about all the different
things where compression

is available in Postgres.

Kind of a broad but shallow topic
maybe this time.

Nikolay: Yeah, we will talk about physical, lower-level compression, not modern ways to compress data where you tell some LLM: I have a lot of data in this table, in this column, it's a huge chunk of text, let's just summarize it and drop the detailed texts, and so on, right?

It's actually my favorite way to
compress.

Michael: Well, that's a good point,
actually.

I guess I assumed automatically
we were talking about what would

be called lossless compression.

Yeah, so like with images.

Nikolay: This is definitely lossy.

This will lose some details for sure.

And it will add some details that don't exist.

Not only is it lossy, it's also glossy, I guess.

So,

Michael: Does it stand for lossy lying machines?

Is that what they stand for?

Nikolay: Yeah, yeah.

So we talk about lossless, lower-level, transparent compression, when we just enable something and data suddenly takes less disk space, or less space when we send it over the network, something in transit, so to speak.

And we are able to decompress it without any losses.

But it's all fully automatic and users just use it, probably by enabling some features.

This is our focus today, right?

Michael: Yeah, exactly.

And I think that's a good transition into kind of the two main benefits.

And I think they're obviously related, but they are somewhat different and probably in some ways trade off against each other a little bit.

One is compression for the sake of storage.

So if we have very repetitive data, or data that compresses really well, we could maybe spend a lot less money and fewer resources by compressing it.

So storage is obviously a big one.

But there's also the performance side of it.

If the data takes up less space then, depending on where our bottlenecks are, it might be that overall, even with the cost or speed penalty of compressing on one side and decompressing on the other, things are still faster, because we only have to transport a lot less data or do calculations on a lot less data, that kind of thing.

Nikolay: Let's maybe start with things we have had for many years in Postgres.

And then discuss some new stuff.

And then discuss and compare what's beneficial and what's less beneficial.

We have, first of all, compression at the WAL level for full-page writes, full-page images, right?

FPIs.

And full-page writes are needed to fix the problem of the difference between the size of buffers, the pages in memory, 8 kilobytes in most cases, and the block size of the file system, very often 4 kilobytes, ext4 for example, right?

So to avoid partial writes in the case of failures, when a buffer is changed for the first time after a checkpoint, Postgres doesn't write only the change itself, it writes the whole buffer.

And by default WAL compression is not enabled, which means the whole 8-kilobyte buffer is written to WAL as is and consumes 8 kilobytes, right?

But if we enable WAL compression,
this page is compressed.

And in my opinion, in most cases,
if we talk about significant

load, we should consider enabling
WAL compression.
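For anyone who wants to try it, here is a minimal sketch of what enabling it looks like; the parameter names are real, but the choice of value is something to test in your own environment:

~~~
# enable WAL compression; this only needs a reload, not a restart
psql -c "ALTER SYSTEM SET wal_compression = on;"
psql -c "SELECT pg_reload_conf();"
psql -c "SHOW wal_compression;"   # should now report the new value
~~~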

It can be beneficial, especially if you have a short distance between checkpoints, because in that case you have more frequent full-page writes happening.

Of course, if you increase the distance, then maybe the same page becomes dirty many times, meaning writes happen inside it multiple times, and only the first change produces a full-page write.

Subsequent changes are written only as deltas, right?

Only the tuple which was changed.

But if you have quite frequent checkpoints, by enabling WAL compression you can significantly reduce the size of WAL written.

And this has very good positive consequences, such as less data to write to backups.

WAL archiving, archive_command, will archive fewer bytes, fewer gigabytes per hour, for example.

And second, replication as well.

WAL is smaller, so replication
has less to transmit over the network.

Of course, compression needs CPU cycles and decompression needs CPU cycles.

And I saw on Twitter some people mentioned they had problems when they enabled WAL compression.

But in the cases I observed, we always decided to switch it on, and the CPU overhead was worth it.

So there is a trade-off here, CPU versus I/O, and we always chose in favor of less I/O.

And with synthetic tests, as I showed you before we started this recording, we had a simple pgbench experiment with max_wal_size of 4GB and checkpoint_timeout of 15 minutes.

We saw WAL reduction on regular pgbench workloads, which include writes, of course: inserts, updates, deletes.

We saw three times less WAL created, generated, or written, it's the same, right, when we enabled WAL compression, which is a huge benefit, three times less WAL.
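A rough sketch of how such an experiment can be measured on Postgres 14+ using the pg_stat_wal view; the database name, scale, and duration are arbitrary:

~~~
# baseline: reset WAL statistics, run a write-heavy pgbench, then check WAL volume
createdb bench
pgbench -i -s 100 bench
psql bench -c "SELECT pg_stat_reset_shared('wal');"
pgbench -c 16 -j 4 -T 600 bench
psql bench -c "SELECT wal_records, wal_fpi, pg_size_pretty(wal_bytes) FROM pg_stat_wal;"
# then enable wal_compression, reload, and repeat to compare wal_bytes and wal_fpi
~~~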

But if you had done checkpoint tuning, as we discussed in some of our previous episodes, and the distance between checkpoints is quite large, especially if you have Patroni and modern Postgres, which in case of failover or switchover doesn't require a restart of all nodes, you can afford a bigger distance between checkpoints and have max_wal_size set to, I don't know, like 100 gigabytes.

In this case, it's unlikely the benefit will be so huge, not 3x.

Of course, it also depends a lot
on the data nature, right?

If it's very repetitive, it's easier
to compress.

Michael: Well, yeah, I mean, that's
pretty true across the board

for compression isn't it?

On the WAL front, one thing I think we probably didn't talk about when we talked about WAL and checkpoint tuning is the compression side as of Postgres 15, which I'm guessing is pretty new, so you might not have that kind of in-the-wild experience yet, but we do have more options now.

So in the past, we could only turn WAL compression on or off.

And now, instead of on, we have three different options.

We have the previous option, which is the PGLZ algorithm, but we also have LZ4 and Zstandard options now.

And for the people that kind of complained about the CPU side of things, one option might be LZ4, which I believe would be less CPU intensive.

It might not be faster.

Huh?

Nikolay: Faster and more lightweight
in terms of CPU consumption.

But the compression ratio is worse.

Michael: Well, it's not that different to PGLZ actually.

Yeah, compared to some other modern compression algorithms, it tends to lose.

But compared to PGLZ, you don't
lose much.

I think it is slightly worse on
average, obviously depends a

lot on the data.

But I think it's a pretty good
option if you have that issue.

And if you're testing turning this on, on a modern version of Postgres, it's worth at least trying those different new options, depending on your constraints.

Nikolay: Yeah, that's a great point.

And I have two ideas here in mind.

First of all, I remember these options, zstd and LZ4, require compile-time flags to be turned on.

I guess, as you rightfully said, I don't have a lot of...

It's a fresh thing, so I don't have rich production experience.

But these flags, I guess they are turned on in the official apt packages, right?

If you just install it on fresh Ubuntu.

I hope so, but it's worth checking.

So Postgres should be compiled with support for these two algorithms.
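A quick way to check whether your build supports them; pg_config shows the configure flags, and simply trying to set the parameter will error out if an algorithm isn't compiled in (a sketch; how you connect depends on your install):

~~~
# look for --with-lz4 / --with-zstd in the build configuration
pg_config --configure | tr ' ' '\n' | grep -E 'lz4|zstd'

# or just try it: this fails with a clear error if zstd support is missing
psql -c "SET wal_compression = 'zstd';"
~~~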

And the second thing: I think it's worth new research.

Maybe with our bot it should be quite easy, and we could just test with different workloads and maybe with different max_wal_size values.

As I said, the distance between checkpoints is very important here.

And just with regular pgbench we could check all available algorithms on fresh Postgres versions, and maybe draw some charts, right?

Plot some charts.

So yeah, we just lack hands, but the bot is definitely ready for that.
So I think it's a great direction.

I wish I had more hands, by the
way.

Like I wanted to say we are looking
for maybe part-time database

engineers, people who want some
fun with this kind of work, like

research.

We usually do it in public, so it's kind of interesting for the community as well.

So if some people listening to
this podcast want to participate

and work part-time with us with
Postgres.AI, definitely would

be interested in discussing.

So maybe we will have more hands
and do this benchmark, for example.

Michael: What's the best way for
them to get in touch with you?

Nikolay: Email or Twitter
or LinkedIn, but email

nik@postgres.ai is always a good way
to contact me.

Michael: On the packaging front,
I don't know, I haven't checked

whether they have compiled Postgres
with those flags, but I was

pleasantly surprised.

I'm on Google Cloud SQL for my
own, for the pgMustard database,

and recently upgraded to Postgres
16, finally.

And I was pleasantly surprised that I was able to turn on WAL compression with LZ4, and for TOAST, I was able to switch our default TOAST compression algorithm to LZ4 as well, which is really cool.

Nikolay: On 16 you said?

Michael: I'm on 16, but I think
those are available as of version

15 for WAL and version 14 for
TOAST.
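For reference, here is roughly what that switch looks like; the table and column names below are made up for illustration:

~~~
# change the default for new values cluster-wide (PG 14+), reload to apply
psql -c "ALTER SYSTEM SET default_toast_compression = 'lz4';"
psql -c "SELECT pg_reload_conf();"

# or switch a single column; only newly written values use the new method
psql -c "ALTER TABLE plans ALTER COLUMN plan SET COMPRESSION lz4;"
~~~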

Nikolay: That's great.

So what benefits, like, do you
remember some numbers?

Michael: So I didn't do extensive benchmarking, but the thing we use it for most is that we have saved plans.

So EXPLAIN plans compress really well.

There's a lot of repetition.

Nikolay: Yeah, many.

Yeah, it makes sense.

So it's in JSON, right?

Michael: Well, text plans compress well too, but JSON plans compress extremely well.

But obviously JSON plans are bigger in the first place.

So there's not...

Nikolay: Yeah, that's good.

Michael: So yeah, it compresses
really well.

But the main thinking was, we don't
mind spending a bit more

on storage for those if the speed
of retrieving them will be

quicker.

So people save a plan, they might
share it with somebody on their

team and that person needs to load
it.

Obviously, we're talking small amounts of time, but if the storage wasn't that different and the speed was faster, I was happy to make the switch.

And yeah, it turned out the storage
was slightly worse on average

for the plans I tested with LZ4,
but the retrieval speed was

faster.

So I bit the bullet and did it.

The cool thing is, the thing I
didn't realize, I thought it would

be a complex migration.

But like, what do you do?

Like, I thought you might have
to change existing plans or existing

data.

But you don't.

If you change the setting, it applies
to new data.

Nikolay: Right.

Yeah, actually, this is a good point we didn't mention.

So we're shifting to the discussion of storage compression and TOAST, right?

But it's a good point we forgot to mention about WAL compression: it doesn't require a restart.

So you can switch it on, switch it off, and all new writes will happen according to the new setting.

Of course, you need to send the SIGHUP signal or run SELECT pg_reload_conf(), so configuration changes are applied without any restart, which is great.

And it also means it's easier to
try.

If you have monitoring, if you're
prepared to roll back, it's

easier to try.

And if things go wrong, you can
return.

So TOAST compression is interesting.

And so, again, sorry, I'm not paying attention to details today somehow.

You chose zstd, right?

Michael: I actually chose LZ4.

I was more interested in the speed
of compression and speed of

retrieval than I was for total
size on disk.

Nikolay: Meanwhile, I asked our bot to double-check how the apt packages, the official PGDG packages, are built.

Of course, these two options are there, so if you install Postgres on Ubuntu, you can try various compression algorithms.

That was just to double-check.

Okay, so TOAST, what else?

We don't have a good ability to compress table data, right?

By default in Postgres we don't have it, and it's actually not a simple topic for row store.

In row store, data is stored tuple by tuple, and tuples can go into different pages.

So if we want to compress it, we can compress it transparently by switching to ZFS, for example.

We saw benefits in terms of disk
space, like 10 to 30 percent,

depending on the data nature.

But ZFS brings new challenges for
administration, for sure.

It's still good to shave off 30%.

And this is what you get by default
if you install DBLab because

it's on ZFS.

So if you have, for example, a terabyte
size database in DBLab,

it will look like 700 to 800 gigabytes
only.

It's much better.
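For anyone curious, transparent compression on ZFS is set per dataset; a rough sketch, with the pool and dataset names invented and the record size being something people tune and debate:

~~~
# create a dataset for the Postgres data directory with compression enabled
# (zstd needs OpenZFS 2.0+; lz4 is the usual choice on older versions)
zfs create -o compression=zstd -o recordsize=8k tank/pgdata

# later, check how well the data actually compresses
zfs get compressratio tank/pgdata
~~~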

But in Postgres itself, there are no good compression options for the heap.

Michael: Except for single large values, like if you have one column that's large, I think TOAST is very good, but not for smaller values.

Nikolay: Oh, actually, yeah, big values are compressed in the heap, right?

Now I remember, before going to TOAST, Postgres tries to squeeze them, to fit them into a size of about 2 kilobytes and have roughly 4 tuples per page.

Roughly, that's the kind of thing I remember.

Michael: Once it's compressed, if it's under 1 kilobyte or something like that, they put it inline on the page.

Nikolay: Or inline.

Michael: Yeah.

It just means that if you have
values that are multiple kilobytes

and compress well, even the default
PGLZ will give you quite

a lot of compression out of the
box, transparently in Postgres.

It's just, if you have lots of
repeating small data, like time

series data, that could compress
as a whole very well, the row

store...

Nikolay: But this compression is applied to a single tuple, a single row version.

So in a single row version we only have one timestamp, for example, one temperature, and so on.

Several different columns, maybe a couple of timestamps, but timestamps of a different nature.

For example, created_at and, I don't know, registered_at, some different timestamp.

And Postgres tries to compress the tuple, and TOAST, we didn't cover it and there's no goal to cover it deeply, we had another episode on it, right?

So Postgres tries to compress.

If it doesn't fit, it's split into chunks, and the chunks are stored in a separate, so-called TOAST table, which is actually also a regular table, kind of invisible to the user, but you can inspect it if you want as well.

Then decompression and reconstruction of the tuple happens when it's needed.
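If you want to poke at it, the TOAST table for a relation can be found through pg_class; the table name here is just an example:

~~~
# find the TOAST table behind a relation and see how big it is
psql -c "SELECT reltoastrelid::regclass AS toast_table,
                pg_size_pretty(pg_relation_size(reltoastrelid)) AS toast_size
         FROM pg_class WHERE relname = 'plans';"
~~~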

But what I'm trying to say is that for the heap itself, there are no rich capabilities to control compression.

And even if we had them, the expected benefits would not be high compared to analytical column store databases, where, for example, we have all temperatures or all timestamps stored in a separate file.

Only this one column, for all rows, is stored there.

They are all temperatures, and maybe these temperatures are coming from a single source, so they don't jump.

I don't remember the word in English, sorry, but the changes are not abrupt.

If they change, for example, if the temperature is increasing, probably it will keep increasing for some time.

It means that compression could be done using various algorithms, for example, applying deltas and storing only the deltas.

We will probably discuss TimescaleDB
soon because this is pointing

in the direction of TimescaleDB
and what it provides.

But for column store you can compress like 10x, which is not possible

Michael: for rows.

Yeah, or more even.

But I think the multiples are interesting, right?

You've missed the fact that, let's say our tuples are like a megabyte, but 99.9% of that is a single JSON column, and we have 10 other tiny little integer columns.

We get way more than a 10x compression ratio just with TOAST, you know, if that column compresses well.

Yeah.

So, I...

Nikolay: As you know, probably, when I was very young I participated a little bit in XML function and data type development in Postgres, and I remember that at that time XML compression was a big thing.

I even remember hardware accelerators that companies were selling to compress XML on the fly, transparently.

It's crazy.

It's because a lot of structural pieces of the values can be compressed well in XML for sure, less than JSON, but still: all those parentheses, braces, quotes and so on.

Or maybe, if it's JSON, not JSONB, also a lot of whitespace.

Right?

Yeah.

Yeah.

So, yeah.

If you have JSON, then in some cases compression and TOAST, tuning it, is interesting.

And maybe we also need benchmarks for JSON as well.

It's good.

Michael: Yeah, it would be interesting, but what I mean is more that the compression ratio varies a lot, obviously extremely dependent on the data.

But I think

Nikolay: I'm very curious what
you have, for example, if you

know that some old values are not
compressed and some new values

are compressed, you check how much
storage is occupied by, I

don't know, by like 1000 rows or
so.

It's worth checking.

Michael: Well, it's basically impossible to store large values in Postgres without compression... actually no, you can turn compression off for TOAST, but the default is on.

So I have never tried it without, so I don't have that to compare, but just to give you an idea.

We didn't use to compress plans at all; we only store plans in local storage by default, the browser's local storage, and that's an extremely limited resource.

I think in Firefox it's like 5 or 10 megabytes, depending on the version.

So that's not very much when you're talking about large query plans.

Obviously, it's a lot for most people's query plans, but some people will come up with query plans that are dozens of megabytes.

Nikolay: Varlena data types accept up to 1 gigabyte, right.

But as I remember, it's better not to go beyond a couple of hundred megabytes because the performance will become terrible.

Michael: Well, just to give you an idea quickly, these plans were compressing from megabytes down to kilobytes.

It was easily more than 95% in some cases.
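A quick way to sanity-check numbers like that on your own data; this assumes a text column, with the table and column names invented for the example:

~~~
# compare the raw text size with what the column actually occupies after TOAST compression
psql -c "SELECT pg_size_pretty(sum(octet_length(plan))) AS uncompressed,
                pg_size_pretty(sum(pg_column_size(plan))) AS stored
         FROM plans;"
~~~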

Nikolay: That's cool.

But still, it's all compression of only one tuple, and actually one value, not a tuple.

Michael: One value, yeah.

Nikolay: But it feels like it would make sense to have some compression, maybe at page level as well, and so on.

This is what ZFS provides transparently
if you put Postgres on

top of it, right?

So compressing pages, I think, even if it's still a row store, maybe having these abilities...

I remember Peter Geoghegan talked a
lot about new features in MySQL

many years ago.

I think MySQL has more settings
in this area for compression

of storage.

Have you heard or remember something
from it?

We have mostly questions today,
right?

Michael: I didn't know that.

Well, they seem to be a bit further
ahead on the storage engine

side of things, don't they?

Like they've had multiple storage
engines for quite a lot longer

than...

Well, like we don't have...

But yeah, so it wouldn't surprise
me if they were ahead on that

front.

But yeah, you mentioned Timescale,
it feels like we've had a

history of a few extensions that
have offered the option for

columnar storage.

Nikolay: Before going there, let's
just touch a little bit on

options that Postgres could have,
and there are some discussions.

Like, what else could be compressed
inside?

For example, temporary files.

Yes, good idea.

Michael: But we should also probably
talk about things that we

actually haven't mentioned, like backup files, pg_dump files.

Nikolay: I know, this is already...

Yeah, I wanted to do it, but look,
before going to backups or

dumps, backups and dumps, it's
already going outside of Postgres.

But imagine we have Postgres, we
run some queries, we discuss

the WAL, which is the first thing
we should discuss because

data is written there first before
it's written in data files

or pages and so on.

So we discussed WAL, we discussed
the storage itself, TOAST,

this is what we have and that's
it.

Not at page level, but it can be
achieved by a file system, but

that's it.

Indexes, I have no idea.

Maybe deduplication, which was done in Postgres 13 and 14 by Peter Geoghegan and Anastasia Lubennikova, maybe this can be considered as compression actually, right?

Michael: That's a good shout, yeah.

Nikolay: Kind of native compression, you know, pieces of compression, because it occupies less space, right?

Why not?

It's an optimization, but still.

And last time we discussed, what
was the topic last week?

Michael: Out of disk.

Nikolay: Out of disk, right.

And remember we mentioned, I named
it exotic and some listener

commented on YouTube, By the way,
thank you all those who write

something on YouTube or on Twitter
or LinkedIn.

It's very good to receive feedback.

So they mentioned that temporary files are not that exotic.

And if you run huge, heavy queries, as they did, you can run out of disk space because temporary files were huge and consumed a lot of disk.

And compressing them makes total sense to me, right?

What do you think?

We don't have it, right?
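While there's no compression for them today, it is at least easy to see how much temp file I/O you generate; a small sketch, with the threshold value just an example:

~~~
# log every temporary file larger than 100 MB
psql -c "ALTER SYSTEM SET log_temp_files = '100MB';"
psql -c "SELECT pg_reload_conf();"

# cumulative temp file usage per database
psql -c "SELECT datname, temp_files, pg_size_pretty(temp_bytes)
         FROM pg_stat_database WHERE datname IS NOT NULL;"
~~~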

Michael: I was watching a really
good video that you sent me

a link to by Andrei

Nikolay: Borodin.

Right, right.

Michael: Cybertech 2023.

Yes.

And I watched it last night and
he mentioned that in order to

be...

I think there was a problem with
durability.

If you want to be...

Nikolay: Forget
about problems.

This is like...

I'm just...

It doesn't make sense.

My answer is

Michael: yes, it does.

He started doing the work and it's more complex than it might sound.

But yeah, it makes sense, but it's complicated.

Nikolay: Full disclaimer, I just
chatted with Andrei this morning

and this is what he told me and
I'm just shamelessly using his

ideas.

So it's worth compressing pages,
as I mentioned, and it's worth

compressing temporary files.

Andrei says it's just a lack of time and hands, but it's worth implementing and proposing a patch.

So, Yeah, it's a great idea.

I hope he will find time to work
on this and other people probably

will find time to move Postgres
forward in this area.

And in this case, imagine if temporary
files are compressed.

In this case, I would be more right
saying it's exotic to be

out of disk space when you have
temporary files occupying a lot

of disk space, right?

Michael: I think it makes extra
sense because the type of things

that are generating temporary files
like sorts and hashes, it's

probably a lot of similar values.

I'm guessing there would be a fair
amount of repetition which

would naturally suit compression
as well.

So I'm wondering if there's like
a...

Not only would it just take up
less space, and obviously there's

benefits there, but I'm wondering
if there might be performance

benefits too.

Nikolay: Yeah, that's a good point
actually.

When we talk about saving disk
space, sometimes it's also about

performance.

If the disk is slow, it's better to spend CPU cycles on compressing and decompressing, and things become much faster.

I agree.

This is a good point.

But it's not always so.

It depends.

In general, I think I saw somewhere the general point that databases are I/O-hungry.

And as you very well know, we talk about buffers all the time.

It just confirms it's usually all about I/O.

But I also saw CPU at 100% and a lack of CPU, especially when, for example, we are on Graviton or ARM and we only have up to 64 vCPUs, and that's it.

These days 64 is not a lot, right?

It's like a moderate number of…

Michael: It's a lot for some people,
yeah.

Nikolay: Right, right.

But it's not 360, it's not 800
like you showed me on AWS for

Intel scalable Xeon.

So if it's 64, a lot of clients,
in this case CPU becomes a very

valuable resource.

Right?

Michael: Of course.

Nikolay: Right.

So in this case, we might prefer to spend more disk I/O instead of CPU cycles.

Okay, what else can be compressed
in Postgres?

We talked about ideas which are
not currently implemented.

Temporary files, page level compression,
maybe protocol level,

which has pros and cons.

Maybe let's not go there, right?

What else?

Michael: Anything else in memory
that could be?

Nikolay: Oh, interesting.

Maybe.

I don't know, but if we shift, for example, to some additional projects, I think we will talk about Hydra and TimescaleDB.

If, for example, we have a storage engine and we have some tables defined as column store, which is what Hydra provides, in this case it makes total sense to compress those tables.

Or, for example, as we mentioned last time talking about disk space, we have row store, but we have partitioning, and some old partitions we consider as holding archived data, and this data we probably want to compress, and maybe we want to store it not on regular disk but on object storage, and we discussed

pg_tier, which Tembo developed.

So, kind of bottomless Postgres, where we want rarely used data to be compressed.

Right?

Michael: Yeah, I saw three different approaches from different extensions.

The Hydra one you mentioned, I think, came from Citus Data originally.

cstore_fdw, I think, was the original.

Nikolay: It's AGPL from Citus, inherited by Hydra.

Michael: Yeah, so I think with that one, from the very start of a table's life, you choose whether it should be row store oriented or column store oriented.

That's one approach that makes a lot of sense.

The Timescale approach seems to be...

Nikolay: And column store, of course, we probably want to compress quite a lot because the ratio is usually good.

Michael: Well, yes.

But I got the impression that the main aim originally with the Citus column store approach, and therefore Hydra, was: yes, you get some compression for the storage benefits, but the main aim seemed to be making analytical queries faster.

So again, it's that performance angle that seemed to be the driving force of why you would want it to be column store in the first place, for the performance of analytical queries that tend to be aggregates over a single column.

And if we've got column-oriented data that's compressed...
Nikolay: It's massively a lot of
I.O.

Of course, we want to use this I.O.

And then already deal with it in
CPU, but also nature of data

is like a lot of similar looking
values and compression.

Michael: And I think once it's organized by columns, you can also much more easily store metadata, like min/max values, and take some shortcuts on that data as well.

So I think there are some cool tricks there too.

But then I think there are two other approaches that I've seen.

One is the Timescale approach, which seems to be about older partitions, like older data.

Everything's row store at the beginning.

And then after a certain point, you set a policy so it gets converted to column store later, once it's unlikely to change?

Nikolay: Okay, I remember it differently.

I remember it's always row store, but compression works in two dimensions.

I'm not sure if they convert it to column store, maybe I'm wrong, but what I remember is it's still row store but with an understanding of the vertical dimension, so to speak.

For example, storing deltas instead of raw values and applying compression.

Michael: Well, I mean, it's still
in Postgres.

It's still in Postgres, so it's
still row store under the hood,

but it's column-oriented, like
it's organized by column.

Like, if you look, the Postgres
docs are really good on this.

I'll share a link.

Sorry, not the Postgres, the Timescale
docs.

Nikolay: I lack understanding here, apparently.

Michael: But there's a third approach as well.

Although, the Timescale approach I think is more optimized for space.

I think the compression there is more about: let's make older partitions take up less space, because you have so much data.

Like 20 times less.

Yeah, some of the numbers they mention are like 95% compression.

Nikolay: We have observed impressive numbers, yeah.

It's really good.
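As a rough illustration of the Timescale approach, going by their docs; the hypertable and column names here are invented, and option names can vary between TimescaleDB versions:

~~~
# mark a hypertable as compressible, segmented by device and ordered by time,
# then let a policy compress chunks older than a week
psql -c "ALTER TABLE metrics SET (
           timescaledb.compress,
           timescaledb.compress_segmentby = 'device_id',
           timescaledb.compress_orderby   = 'ts DESC');"
psql -c "SELECT add_compression_policy('metrics', INTERVAL '7 days');"
~~~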

Michael: But yeah, the idea is that with time series data you could end up with hundreds of terabytes, like they have themselves.

So I think it's the kind of case where you could actually save a lot.

Now, obviously they also benefit
from the performance on analytical

queries and they have some cool
features there, but it feels

like the way they've implemented
it was primarily for those storage

benefits.

And then the third one, which I think has popped up relatively recently in the grand scheme of things, is this idea of, as you mentioned, tiering, or moving data to a file format like Parquet.

So exporting data out of Postgres into a compressed, normally column-oriented format on object storage, so that you can get these fast analytical queries and they take up a lot less

Nikolay: storage space.

There's some big limitations for
data types, I suspect.

Only a limited set of data types
supported for those kinds of

things.

Michael: And how do updates and
deletes work?

I don't actually know all of the
details.

Nikolay: Not possible, I guess.

Michael: Yeah, so there's definitely
limitations and differences

between these.

There are a lot; we're not going to be able to describe them all here, obviously, time-wise.

But I found it really fascinating that there are these three quite different approaches.

What do you think?

Nikolay: Right.

Yeah, well, it's super interesting to observe progress here.

Probably, again, for me this episode is raising more questions than answers.

I think after this episode I will be planning more experiments to study the benefits.

And I guess we always need to take
into account several metrics,

compression ratio, speed of compression,
speed of decompression,

right?

Maybe the CPU overhead itself,
if it matters for us, how much

CPU we consumed.

So yeah, TimescaleDB and Hydra
are interesting in this area.

And I still remember the big impression I got from reading the TimescaleDB details in their blog post about how they implement compression.

I think we forgot, because of me, we forgot to talk about compression of dumps.

Oh yeah.

pg_dump has compression options.

And also compression of WAL files as a whole, which I think Postgres doesn't provide, but both pgBackRest and WAL-G, I think the most popular backup tools, both do it, right?

Because if you have a 16-megabyte file and you compress it to something like three times less, as I remember, maybe 5 megabytes or so, maybe even less in some cases.

Again, it depends on data.

In this case, it's much better
in terms of storage costs and

transfer speed and so on.

But this will consume some CPU cycles, and usually archive_command is working on the primary.

This is the key here.

Usually, to avoid the risk of longer delays, the lag of archiving, because we need the WAL archive, it's part of our DR strategy, the disaster recovery strategy, we want to archive WAL as fast as possible.

And this means we do it on the primary.
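Just to make the concept concrete, whole-WAL compression in an archive_command could look roughly like this; the archive path is made up, and in practice tools like WAL-G or pgBackRest handle this (plus parallelism and retries) for you:

~~~
# compress each 16 MB WAL segment with zstd as it's archived (%p/%f are Postgres placeholders)
psql -c "ALTER SYSTEM SET archive_mode = on;"
psql -c "ALTER SYSTEM SET archive_command = 'zstd -q %p -o /mnt/wal_archive/%f.zst';"
# archive_mode needs a restart; archive_command only needs a reload
~~~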

And if we archive whole WALs, we need some CPU.

And if a lot of WAL is generated, probably we will need multiple workers.

And then you already see more than 100% of a single CPU, meaning multiple cores are busy.

200 percent out of our, say, 360 vCPUs; we allow 2 cores to be used to compress and archive WALs.

It's just like some random numbers
in my head, but definitely

if we talk about compression of
whole walls by WAL-G or PgBackRest,

we need to keep in mind this, the
most valuable CPU resource,

which is on primary.

We need to think about capacity
here.

And decompressing usually is not
a big deal, especially if we

can fetch walls using multiple
workers from object storage, like

S3 and decompress them.

Usually it's not a big deal, but
still worth remembering.

So usually we have compression
there.

Unfortunately, Postgres itself doesn't do it officially right now, so here I'm talking, again, about third-party tools which are very common and popular, WAL-G and pgBackRest.

Others as well, I think they also compress.

And pg_dump.

pg_dump is official, very official.

pg_dump and pg_restore support compression for the custom or directory formats, or both, right?

I always mix

Michael: them up.

Yeah, and this is one of the oldest features, this has been in forever, right?

Nikolay: Right, compression was there forever, but what about options in terms of algorithms?

Oh, yeah.

Zstandard and LZ4, I think, are relatively new.

And it's worth, again, experimenting, benchmarking, and studying.
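For example, with a recent pg_dump (zstd and lz4 were added as dump compression methods in Postgres 16, I believe), a parallel compressed dump and restore looks roughly like this; the paths, database names, and numbers are illustrative:

~~~
# directory format supports both parallel workers (-j) and per-file compression
pg_dump -Fd -j 8 --compress=zstd:5 -f /backups/mydb.dir mydb
pg_restore -j 8 -d mydb_copy /backups/mydb.dir
~~~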

I think there are articles from CYBERTEC, right?

Let's check.

Yeah, there are articles comparing different types of compression and decompression.

And the ratio, I think you can even control the compression level there: if you want to spend more time and CPU capacity to achieve a slightly better compression ratio, it's possible.

Yeah, I remember some surprising
results from those articles.

I don't remember details, so let's
attach them.

But definitely we want to study
this as well.

Again, mostly questions today,
not answers, not exact recipes.

But the only thing I don't like in pg_dump and pg_restore is that you cannot use parallelization and compression on the fly.

What if this is a one-time operation, I just want to logically migrate from one database to another database, and I don't want to send a lot of bytes over the network.

I want to compress and decompress
and use multiple workers because,

for example, these servers are
not used right now.

So I just need to migrate out of RDS to self-managed Postgres, because I feel a sufficient level of confidence, I've already worked with great guys who know how to do it, and RDS is not needed for me anymore, and so on.

In this case, unfortunately, you need to first save the dump to disk and then restore from there.

Yeah, so this is, I think, a big missing feature of Postgres.

Dimitri Fontaine created pgcopydb, I think, but I quickly checked before our recording and I didn't see anything related to compression.

It talks only about parallelization: let's have, like, 16 workers, or, I don't know, 4 workers, to speed up the process, which in my head is a relatively simple idea.

We just create a repeatable read transaction, keep it open, export a snapshot, create other repeatable read transactions that use the same snapshot so all of them are synchronized, and then you can even read one huge unpartitioned table in chunks.

It will be a fully consistent read using multiple workers.

But no compression options, I couldn't find any.

Maybe it's not an easy idea to
implement.

Maybe it's a good idea.

So what I would like to have is just pg_dump and pg_restore supporting both parallelization and compression, without the need to store an intermediate file or directory.

Would be great.

Makes sense.

Yeah, I needed it yesterday.

Copying a huge, well, not huge at all, like 10-gigabyte table with big vectors.

I mean, a lot of vectors.
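What you can do today for that kind of one-off copy is a single-worker pipe, compressing in transit; a sketch with made-up database and host names:

~~~
# no intermediate file, but only one worker: custom-format dump piped straight into restore,
# compressed on the wire with zstd over ssh
pg_dump -Fc sourcedb | zstd -c | ssh target-host 'zstd -dc | pg_restore -d targetdb'
~~~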

By the way, let's come full circle and at the end mention that although we joked about lossy compression using LLMs, when you have a lot of dimensions, huge vector values, it means TOAST is definitely involved.

It's interesting how it's compressed there.

And in this case, just a simple idea: sometimes reducing the number of dimensions is kind of lossy, but it's also kind of compression, right?

And OpenAI talks about it.

It's also an interesting area, how to do it without losing a lot of quality.

Michael: On vectors, though, I think Jonathan Katz made a really good point on the episode we did for Postgres FM: because the vector data, at least the embeddings that come back from models like OpenAI, is mostly random integers, it doesn't compress well.

So there's actually, I think, I'm
not sure he's done it yet or

he's planning to do it.

There's work to look into whether
you'd actually be, it'd be

beneficial to turn compression
off for TOAST of vector data because

you're not getting much compression,
there's no point paying

the overhead of the compressing
and decompressing each point.

So I thought that was super interesting
as a use case for turning

it off.
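For completeness, the way to keep a large column out-of-line but skip compression is the EXTERNAL storage mode; the table and column names here are hypothetical:

~~~
# store vectors uncompressed in TOAST (out-of-line, but no compress/decompress overhead)
psql -c "ALTER TABLE items ALTER COLUMN embedding SET STORAGE EXTERNAL;"
~~~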

Nikolay: Yeah, interesting.

I would like to explore it myself.

It's a very interesting area.

So I think we discussed maybe 5
different directions for benchmarking.

I would like to conduct these benchmarks.

Michael: Cool.

Let us know how you get on with
all 5 by next week, right?

Nikolay: Yeah.

Well, yeah.

So thank you.

It was interesting.

Many questions.

Michael: There you go.

Not boring after all.

Thanks so much, Nikolay.
Catch you next week.