Nikolay: Hello, hello.
This is Postgres.FM.
And my name is Nikolay from Postgres.AI.
And as usual, my co-host is Michael, pgMustard.
Hi, Michael.
How are you doing?
Michael: Hello, Nikolay.
I'm good.
Thank you.
How are you?
Nikolay: I'm fantastic.
And looking forward to our discussion today. I think it's really
important because, you know, some people think everything we
store in a database can probably be considered time series.
At least any small piece of data, any datum, right?
It was born with some creation timestamp, for example, so
we could consider it that way.
Let's discuss what time series is.
Michael: Yeah, I have heard, I've definitely heard people claiming
that all data is time series data but they tend to work for time
series databases so I think there's some correlation or some
incentive.
Nikolay: I'm not sure which one is the cause and which is the consequence
here, it's not obvious.
Michael: Exactly. But for the purposes of this, I like
to think of time series data much more rigidly, as things that
are very, very useful to log at points in time and then compare
and look back at. You know, things we want to know the differences
of, or the averages, the mins, the maxes. That kind
of stuff feels much more useful to me to think of in these
terms, because it brings certain challenges, especially at scale.
So I think if we consider everything time series, this episode
will be very difficult.
Nikolay: Yeah, I agree with you that we can look at it like
this.
Anything can have creation or modification timestamps, but
sometimes timestamps matter a lot, like for analysis and for
operations.
Sometimes they matter less.
For example, if we want to work with some geographical data,
maps and so on, we put cities on the map, for example.
Yeah, they have a creation year, or year of creation.
Sometimes it's not clear actually.
Right.
But for day to day operations, it's not that important.
Maybe population is more important and so on.
It depends.
Maybe position is more important.
So it's hard to say this data is time series, although it has
creation timestamp.
Each part of this has it, right?
So an opposite example is when we log something from something,
from Postgres or from systems or from drivers, I mean, cars,
which are constantly moving on the map, or temperature, or anything,
right?
This kind of data.
There we log many, many times for the same source; we obtain
data from the same source again and again.
For example, there is some sensor or some source
of metrics, and we log these metrics every minute, for example.
In this case it can definitely be considered time series.
And in many cases, not always, but in many cases, we want fresh
data to be more detailed, ideally every second, for example, or
at least every minute. And old data we still want to keep, but
it can be rougher, right?
Less precise, and it can already be aggregated and so on.
And sometimes we can even say, this data is not relevant anymore.
If it's some logs, we don't care about logs from operations from
5 years ago, for example.
That also might happen.
It depends on the concrete project or product you are dealing with
or building, right?
So but what matters is like we obtain data periodically from
some source, and this is definitely time series to me.
Unlike cities.
Michael: I think for the purpose of this conversation, it's probably
worth distinguishing at what point we start caring, because if
we're logging every second, it only takes a few hundred, maybe
a few thousand devices before that's significant scale. But even
if we're logging every hour, if we've got, let's say, a
million devices logging every hour, that's also a lot of scale.
But a lot of use cases won't have that.
So there are smaller cases where, whilst they are time series,
like let's say you've got a little home setup where you are monitoring
the weather outside, maybe in the garden and maybe at the back
of the house, maybe you've got 2 or 3 sensors or
4 or 5 devices logging even every second, say, you
probably don't need this episode, right?
You don't need to do any tuning, you don't need to do that much.
Okay, interesting.
You think
Nikolay: maybe.
It depends.
There is a big misconception among new Postgres users.
They think if the table size is small, in terms of row count, it
should be very fast, no problem.
But we know how MVCC is organized in Postgres, right?
Every update means a new physical
row, which we call a tuple, or tuple;
we're still not sure after 2 years,
almost 3 years of discussing.
Anyway, a physical row version is
created on every update, unless
this update is a special case, like
a HOT update.
It depends.
So it's basically a delete and insert
all the time.
By default it's so, right?
It means if you update the same
row many times, with SELECT
you still see 1 row, but physically
there might be many, many physical
versions, and you have a lot of
dead tuples, and then autovacuum
comes, and autovacuum is a
converter of dead tuples to bloat,
and you have a lot of bloat, and
then you have problems with performance
although you think the table is small.
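A quick, hedged illustration of how to spot this: a "small" table near the top of this list is probably suffering from the update-heavy MVCC pattern Nikolay describes.

```sql
-- Tables with the most dead tuples; compare dead vs. live tuples and watch
-- whether the total size keeps growing despite a tiny row count.
SELECT relname,
       n_live_tup,
       n_dead_tup,
       last_autovacuum,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;
```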
Michael: I completely agree, but
I think it doesn't apply to this use case.
A time series use case normally
is append-only, right? Normally
you're just inserting new rows,
so I don't think it applies here.
Nikolay: Well, I see updates in
time series data all the time.
Michael: Some.
Okay, tell me about that, because
I don't see it, I hear mostly about...
not like...
Nikolay: All right.
Michael: It's append-mostly, let's
say.
Nikolay: Imagine, for example, CI
pipelines.
I won't tell the name of the company,
but some listeners from
that company should recognize it.
You have a pipeline and some
steps, like jobs or something, inside
the pipeline.
And when you create this pipeline,
you then update it a few
times if the status has changed. It
resembles a queue-like thing, but
it's not just a queue, you need
to log everything and present
it.
All right.
So you might create some record
and then update its status.
Yeah.
Right.
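A minimal sketch of the shape Nikolay is describing, with hypothetical table and column names: rows are inserted as events happen, but the status is then updated in place, producing new row versions even though the row count stays modest.

```sql
CREATE TABLE pipeline_jobs (
    id          bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    pipeline_id bigint      NOT NULL,
    created_at  timestamptz NOT NULL DEFAULT now(),
    finished_at timestamptz,
    status      text        NOT NULL DEFAULT 'pending'  -- pending, running, success, failed
);

-- Each status change is an UPDATE, i.e. a new physical row version under MVCC:
UPDATE pipeline_jobs SET status = 'running'                      WHERE id = 42;
UPDATE pipeline_jobs SET status = 'success', finished_at = now() WHERE id = 42;
```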
Michael: But, but let's say my,
Nikolay: my example,
Michael: Well, maybe by some definitions,
but I'm thinking more
like you mentioned logging or you
mentioned monitoring, maybe
real-time analytics, like these
kinds of things where time moves
on and we just want readings over
time, we want to graph.
Nikolay: Yeah, okay, let's agree
on this: we need to distinguish
append-only time series and time
series with some modifications.
Yeah.
Michael: Maybe, but yeah, really
good point though: if you've got
modifications, you might have to
worry about these tips and tricks
quite a bit earlier in terms of
your raw volume.
Good point.
Nikolay: Right, right.
And yeah, so the problem
is not huge scale, but
I think it's a great idea to think
in advance how many records
you expect per unit of time, per
minute, per hour, per day, per month, per year,
and then what your plan is
to do with it, and understand
when it's approaching a billion
rows or a hundred gigabytes or something.
You need to...
Michael: By billion, you mean like
a billion rows?
Nikolay: Billions.
Billions of rows or hundreds of
gigabytes, a terabyte.
For me, a billion rows is roughly
a terabyte for row storage.
Roughly, very roughly.
It depends a lot.
It can be 5x in both directions, right?
But for row-store Postgres,
it's roughly so.
For many time series
database systems which are built only
for time series, like VictoriaMetrics,
or you can consider ClickHouse
as well,
1 terabyte, 1 billion rows, is
maybe 100 gigabytes, maybe 10
gigabytes even, depending on compression.
Their compression is much better.
So this is like overall capacity
planning, right?
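A rough back-of-the-envelope check on an existing table (the metrics table name is hypothetical): total on-disk size including indexes and TOAST, plus average bytes per row.

```sql
SELECT pg_size_pretty(pg_total_relation_size('metrics'))                         AS total_size,
       pg_size_pretty(pg_total_relation_size('metrics') / greatest(count(*), 1)) AS avg_bytes_per_row
FROM metrics;
```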
But we can have performance problems
locally, thinking, oh, this
is a small table,
it should be like a 1 millisecond
lookup.
But then, if you ignore
MVCC in Postgres, you might
end up having a lot of dead tuples
or bloat, and latency will
degrade a lot as well, and that's
not good.
Michael: Cool.
Well, if we go back, if we go to
the append only scenario, what
are the main problems you tend
to see in that setup?
Nikolay: Well, for me, the main
problem is... let's pull out the
elephant in the room.
Let's recognize it.
There is a great product called
TimescaleDB.
Michael: Oh yes, yeah.
Nikolay: Absolutely great.
In my opinion, the 2 major
things they have are compression,
unbeatable compression for Postgres,
for time series, and continuous
aggregates, which is also good,
it's convenient.
You can build continuous aggregates yourself,
but it's super hard to build
good compression.
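For reference, a minimal sketch of what those two features look like, assuming TimescaleDB 2.x under the Timescale (non-Apache) license, with hypothetical table and column names:

```sql
CREATE TABLE metrics (
    ts        timestamptz NOT NULL,
    device_id bigint      NOT NULL,
    value     double precision
);
SELECT create_hypertable('metrics', 'ts');

-- Columnar compression, segmented by device and ordered by time:
ALTER TABLE metrics SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'device_id',
    timescaledb.compress_orderby   = 'ts DESC'
);
SELECT add_compression_policy('metrics', INTERVAL '7 days');

-- Continuous aggregate: hourly averages maintained incrementally.
CREATE MATERIALIZED VIEW metrics_hourly
WITH (timescaledb.continuous) AS
SELECT time_bucket('1 hour', ts) AS bucket,
       device_id,
       avg(value) AS avg_value
FROM metrics
GROUP BY 1, 2;
```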
And the problem is that TimescaleDB
is great, but it's
not part of core Postgres.
It's an extension which is
available only either on Timescale
Cloud or self-hosted, if you are
okay using something that is not open source.
It's source available.
It's very open.
I know they allow many, many companies
to use it,
but still, if you are going to
build some platform, you probably
will not be allowed to use it.
And so on with compression.
I mean, I'm not talking about
Apache-licensed TimescaleDB, I'm
talking about the Timescale-licensed
TimescaleDB, which has compression
and continuous aggregates.
Compression is great.
Again, I recommended reading a
couple of blog posts before, and I
recommend them again, they are amazing.
But it's not core Postgres.
Michael: So...
But I actually think the
list of TimescaleDB features
is a really good list of things
that are tricky with time series
data in general. They've
also done great blog posts
on things like how to copy data
into Postgres, how to ingest data
as fast as possible, so that's one
of the trickier things
about extreme volume.
There was actually a tips-and-tricks
blog post, something like
13 tips for ingesting, and all of these types
of blog posts come out with: oh,
you should use Postgres's COPY feature.
There are some exceptions,
but basically if you can
get the data in in a way that makes
the most efficient use of
writing data in batches, then you
can get much, much better throughput.
But then there are other tips and
tricks, like not having many
indexes on the table that you're
ingesting into.
Maybe none, but probably at least
1.
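A hedged sketch of batched ingest, reusing the hypothetical metrics table from above; the file path is made up, and COPY FROM a server-side file needs appropriate privileges (psql's \copy is the client-side alternative).

```sql
-- Bulk load with COPY, usually much faster than row-by-row INSERTs:
COPY metrics (ts, device_id, value)
FROM '/tmp/metrics_batch.csv'
WITH (FORMAT csv);

-- If you must use INSERT, send many rows per statement rather than one at a time:
INSERT INTO metrics (ts, device_id, value)
VALUES ('2025-01-01 00:00:00+00', 1, 20.1),
       ('2025-01-01 00:00:01+00', 1, 20.2),
       ('2025-01-01 00:00:02+00', 2, 19.8);
```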
Nikolay: Also foreign keys can
slow you down a lot, and also checkpointer
behavior.
If you expect a massive write
coming, like a one-time load,
it's worth thinking about
checkpointer tuning,
for this window only, accepting the
risk that if we crash,
recovery will take longer.
So you raise checkpoint_timeout
and max_wal_size, and it should
speed things up because the checkpointer
will put less pressure on the disk.
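A sketch of that tuning for the duration of a bulk load; the values are hypothetical, and the trade-off is longer crash recovery while they are in effect.

```sql
ALTER SYSTEM SET max_wal_size = '64GB';
ALTER SYSTEM SET checkpoint_timeout = '30min';
SELECT pg_reload_conf();  -- both settings take effect on reload, no restart needed

-- ...run the bulk load, then revert:
ALTER SYSTEM RESET max_wal_size;
ALTER SYSTEM RESET checkpoint_timeout;
SELECT pg_reload_conf();
```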
Michael: Yeah, nice.
Well, on that note, and I know
this is something I think we're
probably going to talk about in
the future, but because you can
set it on a, like for a single
one-off thing, you could even
consider in that case turning off
synchronous_commit for those
ingests, like if you want to...
Nikolay: I don't think it will
help with throughput at all.
Really?
Yes.
Well, how many commits per second
are we going to have?
Michael: Yeah, no, good point.
Nikolay: But also, thinking about
checkpointer tuning, I think
it depends, because in an append-only
specific case, probably we
won't have a lot of benefit from
checkpoint tuning, because checkpoint
tuning for massive writes shines when
we have random inserts to many
pages and we revisit pages.
If in append-only,
Michael: the thing
Nikolay: is that we're constantly
writing to the end of the data,
of the heap, and also the B-tree index.
It's only to the right, to the
right, and it's great.
Michael: Well, that's a good point
then.
Yeah, we need a primary key that
grows to the right.
Nikolay: Yes, that's why UUID v4
is a terrible idea, right?
Yeah, yeah, yeah.
Right.
So you want UUID v7, or just a
bigint, or something else which
grows similarly...
Like a ULID type of thing.
So B-tree inserts will happen only
to the right, always to the right,
and so on.
And the index will be tightly packed and so on.
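A sketch of an append-friendly key, with hypothetical names: a bigint identity (or a time-ordered UUID) keeps B-tree inserts on the rightmost leaf pages.

```sql
CREATE TABLE readings (
    id        bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    ts        timestamptz NOT NULL DEFAULT now(),
    device_id bigint NOT NULL,
    value     double precision
);

-- With UUIDs, prefer a time-ordered variant (UUIDv7) over random UUIDv4.
-- uuidv7() is built in only from Postgres 18; on older versions generate the
-- value in the application or via an extension, e.g.:
-- id uuid PRIMARY KEY DEFAULT uuidv7()
```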
But this also leads us to the question
about partitioning, right?
Because partitioning is the key.
So let me finish my little rant
about Timescale
and how sad it is that
it's not a fully open part
of Postgres, because I know this
is their business.
I fully respect that, at the same
time.
It's just that we have companies who come
to us asking for help.
For many of these companies, we
do a health check, we improve health,
we level up.
Our typical client
is a startup which grows and
starts having some issues; very
often they are on some managed
Postgres service.
So backups are solved, replication
is solved, and so on.
But then they have some problems.
And it's typical that many of them
consider partitioning.
And then we realize, okay, partitioning
is needed because tables
are already large, and we have a simple
rule: if you exceed
100 gigabytes, you need to do partitioning.
But then we see, oh, this data
is actually time series, right?
And it's inevitable to consider
TimescaleDB, but, for example,
if they are on RDS or somewhere else
similar, they cannot use it.
All they can do is start
doing partitioning themselves,
maybe with Partman, maybe not.
Or migrate to Timescale Cloud.
And we sometimes can help them
to consider that.
Michael: Or migrate to self-host,
or like, you know, self-managed
with TimescaleDB?
Nikolay: Well, it depends.
For some companies it's a good
path, but if they have a very
lean approach in terms of how many
engineers they want to have,
how many of them they want to be
involved in maintaining backups,
replication, and so on,
it's a big question.
Honestly, I fully agree with
people like 37signals about
migration back from the cloud,
right, even.
Or at least from a managed service.
I think it will be happening, and
if a crisis hits us and so on,
it will be happening for sure.
Many people will try to move away
from RDS, because honestly I think
these days we have all the pieces,
already battle-proven, to have
a cluster working with backups and
replication, autofailover, everything
like that.
It may be Kubernetes, maybe not
Kubernetes, it doesn't matter actually.
We have many pieces already battle-proven,
and we can live without
RDS or any managed service and
not spend a lot of effort to
maintain this.
But still, there is also a business
for Timescale.
They want to have paid customers.
Again, I respect that.
And sometimes we have to consider
moving there, but if you move
there, there are also complications,
because Timescale Cloud has limitations,
like just a single logical database,
right?
In some cases it's not convenient:
you have a cluster, but you
cannot run an additional CREATE
DATABASE and have multiple logical
databases inside.
This is a limitation of Timescale
Cloud.
And anyway, it's like the Middle
Ages in Europe.
We have many, many, I don't know,
dukes or so, and many
counties, right?
Counties.
Yeah.
It's called counties in your country
and in our country as well.
Like, I'm living in San Diego County.
So yeah, this is a
fragmented space.
Right.
And if you want good TimescaleDB,
you need to
go either there or here; you cannot
do it on any other managed
service.
It's bad. I don't understand
why they cannot reach some
business agreement.
I mean, AWS and the Timescale company,
and Google, and Azure, I mean,
Microsoft, and so on; it would be
good for everyone.
Right, it's ridiculous.
Because I'm telling everyone,
compression is great.
Read these posts, you'll understand;
just run experiments, it's
so great, right?
Without it, my advice is:
okay, maybe not Postgres
for this kind of data, maybe ClickHouse.
ClickHouse was created to handle
a Google Analytics-like
load.
You ingest a lot of logs into it,
a lot, a lot of time series
data, and store it and so on.
TimescaleDB shines in compression
because they apply
two-dimensional compression;
they also compress column-wise,
right?
That's great.
And if you have
metrics from some sensor coming
to you every second, for example,
they cannot change a lot quickly,
right?
They change a little,
like a temperature or position,
right?
The position of some driver.
And it means that we can convert
them to deltas and then additionally
compress, and additionally compress,
and TimescaleDB reaches something
like 30x compression, for example.
I remember some case.
It's great. Without it, wow, it's
a lot of data. And it's not only
storage, but it's also memory,
the buffer pool, WAL...
Michael: WAL.
OK, yeah.
Let me pull us back, because I
think you're right that at a certain
scale, even if you only care about
pure disk size, it really, really matters.
But I think there are a lot of use
cases, I'd say most use cases
that I see, that are smaller.
Like, there are a lot of time series
use cases that are
in that middle
ground, right?
Like, they're big enough to be
considering partitioning, certain
maintenance optimizations, some really
considerate schema design, but they're
still manageable within core Postgres
quite healthily.
Nikolay: If you have a project which
you predict won't explode in
terms of growth, it's okay.
Michael: Yeah, or it's like growing
not exponentially, but like
steadily.
Nikolay: If you expect you won't
have more than like 10 terabytes
of data in the next 10 years,
oh, it's okay.
Maybe 5 years.
It's okay to implement partitioning
yourself, and maybe aggregate
old data and repack it somehow,
for example.
Partitioning is great because it
gives you many things.
For example, if you have a
partition which is one day
from the previous month, okay, it's
time to repack it and store only
aggregates, rough data,
per hour, for example, not per
second anymore.
It's a huge reduction.
And if you don't have partitioning,
it will be a nightmare to deal
with updates or deletes, and then
MVCC again, right? Autovacuum will
come and you have empty space, and
then some new insert will go
there, and now this page is out
of the visibility map; it's not
all-visible, it's not all-frozen,
so autovacuum needs to chew
it again. If we have
partitioning, we have locality. And,
by the way, I talked
to someone recently and we
discussed partitioning, custom
partitioning without TimescaleDB, and the discussion was: should
it be partitioned by customer or
project, or time-wise?
Of course time-wise. If we talk
about this kind of data, it
must involve time
in the partitioning, because in
this case you will have a fresh
partition where all inserts are coming,
to one partition, right?
And then you have archived data,
and you can deal with it separately.
So you can repack a partition, detach
the old partition,
attach a new partition, whatever
you want, fully online.
And autovacuum will be fine, and
no new inserts will come to
this partition, right?
Unless you support some occasional
updates of old data.
Even then, if those updates are not
massive, it's still good.
You can do that; you can repack
by updating, basically.
You have a second-resolution partition
for a whole day from the
previous month, and then you convert
it to, for example, hour
resolution.
3600 times smaller.
It's great.
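A sketch of this recipe with native declarative partitioning; table names, intervals, and aggregates are hypothetical.

```sql
CREATE TABLE measurements (
    ts        timestamptz NOT NULL,
    device_id bigint      NOT NULL,
    value     double precision,
    PRIMARY KEY (device_id, ts)
) PARTITION BY RANGE (ts);

CREATE TABLE measurements_2025_06_01 PARTITION OF measurements
    FOR VALUES FROM ('2025-06-01') TO ('2025-06-02');

-- Roll an old daily partition up to hourly resolution:
CREATE TABLE measurements_hourly (
    bucket    timestamptz NOT NULL,
    device_id bigint      NOT NULL,
    avg_value double precision,
    max_value double precision,
    n         bigint,
    PRIMARY KEY (device_id, bucket)
);

INSERT INTO measurements_hourly
SELECT date_trunc('hour', ts), device_id, avg(value), max(value), count(*)
FROM measurements_2025_06_01
GROUP BY 1, 2;

-- Then detach the detailed partition (CONCURRENTLY since Postgres 14) and
-- drop or archive it:
ALTER TABLE measurements DETACH PARTITION measurements_2025_06_01 CONCURRENTLY;
DROP TABLE measurements_2025_06_01;
```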
And actually, this is a great recipe.
This is not compression, but it's
kind of a replacement, an
alternative, right?
And then you have...
Michael: I hear this get called
roll-up tables quite often, which
I really like.
And you could have multiple granularities,
like you could have
hourly, but you could also have
daily, monthly, like depending
on how fast you need those queries
to be or what you're using
them for, what your dashboards
are showing or what customers
need to see.
You can have a few different ones
and each one is going to be an
order or orders of magnitude smaller
than the previous one and
therefore faster to query.
Exactly.
Yeah.
So I think this is a really good
solution, and the other thing
about partitioning I'm not sure
you mentioned yet is not just
rolling up, but
also just purely dropping the...
Nikolay: Yeah, yeah, yeah, that's...
Michael: ...getting rid of that data.
Nikolay: A solution, yeah. Or, well,
there are also attempts to
have tiered storage.
So maybe old partitions should
go to S3 or GCS or the blob storage
on Azure, whatever it's called, I'm
still trying to remember.
Anyway, object storage, and
there we can choose the tier,
like S3 Glacier, right?
It's cheaper.
It's slower, but people rarely
access this data, and so on.
Yeah, there are several options
here.
You need some strategy for data
management in the longer term,
right?
For this data.
I see how companies struggle when
they're already hitting some
limits and they need to do something
about it.
But it's a lot of effort to redesign
things.
If you think about it earlier,
maybe it will save you something.
It's harder in the beginning, as
usual.
A bigger challenge to
solve.
But then it
Michael: pays off.
Yeah.
I think this is such a difficult
1.
Like, you've got all of the traps.
You've got traps in both directions,
right?
You've got all of the premature
optimization traps, but then
also...
I don't...
Nikolay: Well, yeah.
If it was...
Or just use TimescaleDB, that's
it, right?
Michael: So I do think there is
an argument for it, and not just
because of the scalability, but
I think, like, developer experience-wise,
you don't have to...
Like, even with pg_partman, there's
a bit of setup.
Like, with continuous aggregates,
there's quite a lot of setup
for you if you want to
Nikolay: do that.
But pg_cron and
pg_partman are available
everywhere.
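A sketch of that setup, assuming pg_partman 5.x and pg_cron are installed (and reusing the hypothetical partitioned measurements table from the earlier sketch); exact function signatures differ between pg_partman versions.

```sql
CREATE SCHEMA IF NOT EXISTS partman;
CREATE EXTENSION IF NOT EXISTS pg_partman SCHEMA partman;
CREATE EXTENSION IF NOT EXISTS pg_cron;

-- Let pg_partman manage daily partitions, pre-creating future ones:
SELECT partman.create_parent(
    p_parent_table := 'public.measurements',
    p_control      := 'ts',
    p_interval     := '1 day'
);

-- Run partition maintenance (pre-creation, retention) every hour via pg_cron:
SELECT cron.schedule(
    'partman-maintenance',
    '5 * * * *',
    $$CALL partman.run_maintenance_proc()$$
);
```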
Michael: I completely agree, I completely
agree. I'm just saying,
from a UX point of view, like a developer
experience point of view,
there's also a bunch of functions
that come with TimescaleDB
that are really useful and make certain
things much simpler to write
queries for.
So I do think there's an argument
for that, but I do think you
are then trading off freedom; there
are then only so many
managed services you can use. And
I think there was even a recent
alternative that popped
up from the team at Tembo.
I don't know if you... I haven't
had a chance
Nikolay: to look at
Michael: it properly yet, but pg_timeseries?
Nikolay: Ah no, this I haven't
tried.
Michael: Yeah, it's new, and they give
the exact same reason in
the readme; you can see the exact
reason they've done it is because
they could only have used Apache-licensed
TimescaleDB and they realized
they needed something a bit more
than that.
Nikolay: And then additional complexities
hit you when you grow,
and at some scale schema changes
become a problem.
Right? For partitioned tables they
are much harder; they're
not easy to do at scale anyway.
But with partitioned tables...
Michael: But this was kind of my point, that you're
paying some complexity up front for doing this, and it is kind
of a premature optimization to partition a table that is absolutely
tiny. And yes, it will be a pain to move from a single table to
a partitioned table later, but you've also got away with some simplicity
for maybe a year or 2, maybe longer.
Nikolay: Well, anyway, I think if you have a time-series
situation, yeah, and you expect some terabytes
of data in a few years, I would implement partitioning right
away anyway, because this would keep me from writing queries
which are not compatible with partitioning.
And those queries are really hard to redesign later.
It's better to have some partitioning right away, and the partitioning
key should involve the timestamp.
And that's great.
I mean, yes, it's some inconvenience in the beginning, but it's
not rocket science at all these days; many people have done it
and there are blog posts.
I have a how-to recipe on how to partition a table with
UUID v7.
The example there uses TimescaleDB, but the approach is
agnostic, so it can be done without TimescaleDB, same thing.
And yeah, I think partitioning
is a must if you expect terabytes
of data; if you expect dozens of terabytes, it becomes an absolute
must from the beginning.
And it's a pity that we don't have... Like, do you think partitioning
could be improved in Postgres?
In which areas?
In which areas?
Michael: Well, to be fair, it has been improving.
Like, there's so...
Each release, every release, there's quite a significant improvement.
Nikolay: Well, in
Postgres 10 we got declarative partitioning, right?
Michael: Yes.
Nikolay: And then it was improving only, like, evolution-wise, like, polishing
things.
Michael: Okay.
Nikolay: Do you think a big step could be done UX-wise,
to simplify it?
For example, I remember we were helping a very famous company
to partition, like, during a weekend.
I think it was more than 1 year ago.
It was an AI company, super popular.
We helped them and it was great, and it was actually an interesting
experience to help do it very quickly without downtime
and so on.
Michael: Without downtime is the hard part.
Nikolay: Yeah, but for me it's
easy because I know the... like,
for me, not so easy, but I know
how to do it and how to verify
it before deployment and so on.
All my focus was on that part.
What we missed, I'll tell you the
story, it's funny.
I was washing my car and I was
chatting with a friend, also from
the Bay Area, and I told him,
you know, we helped this company
to implement partitioning.
He mentioned he's using
the products of that company.
Okay.
He's using the product.
I mentioned, okay, we just helped
them implement partitioning.
And he said, oh, the service is
down right now.
I said, really?
And I'm looking at my watch and
I see 5 p.m.
And I realized, 5 p.m.
Probably this is the time, it's
like 5-0-0, right?
It's probably the time when a new
partition must be created.
And this we completely missed.
Creation of new partitions.
Michael: Wow.
Nikolay: It's like we focused on
hard topics and failed at
Michael: simple topics.
Nikolay: You must create partitions
in advance.
And this is on the shoulders of the developer.
That's not right, right?
We need pg_cron or something.
Why is it like...
DX, developer experience, is not
excellent, unfortunately.
I think some things can be improved
in this area.
But of course it could be good
to do it evolutionarily as well.
Like maybe pg_cron should be part
of Postgres, first thing.
Cool.
Yeah, maybe.
At least a contrib module
or something, I don't know.
And then probably partition creation
could be part of the declarative
syntax: you define it when you
create your partitioning schema,
you say, okay, new partitions should
be created hourly or daily
or something.
Yeah.
Right?
Right now it's terrible, everything
is manual.
I mean, it's not terrible, it's
much better than when I implemented
partitioning for my first RDS project
and it was inheritance-based.
I did it with 0 downtime, it was
great, but I spent like a couple
of months.
Yeah, this was the time when I started
understanding that clones are
great, because you can experiment
and check everything.
Michael: Going back, I think what
I meant when I said no downtime
is the hard part: I mean, new features
for partitioning generally
come in and there are so many kind
of, like, foot guns of what you
can do without a heavy lock and
what needs...
Nikolay: 1 keys
Michael: ...like even dropping an
index. So there are so many
tricky things that get implemented
first in a way that has heavy
locks, and then later in a way that
can be done without, so therefore
in a 0-downtime fashion.
So I do see it as probably only going
to improve incrementally,
but I don't see why automatic creation
couldn't be a complete
thing.
It could also help... yeah, remember
we were talking about pg_squeeze,
and that's another feature; even
though I think it was VACUUM
FULL CONCURRENTLY that could come in.
With glass
Nikolay: option maybe, right?
Michael: Yeah, but the other feature
it has is scheduled, let's
say, repacking.
Repacking.
Or whatever, squeezing.
So scheduled squeezing.
Well, it's different.
It would need the same kind of
logic though, right?
It would need the same... I guess
that's trigger-based instead
of time-based, but it's a similar
kind of background worker type
thing.
Nikolay: Oh yeah, maybe you're
right, actually.
Yeah.
I see evolution here.
First of all, I fully agree, there is a
whole class of problems that
are gradually being solved in every major
release related to locks,
partitioning versus locking, right?
Indexes, foreign keys, detaching,
attaching; we remember every
release for the last 10 years maybe.
When version 10 was released,
it was 2017, I guess.
Michael: The way I like to do it: 17 is the
most recent, so it's 7 years
ago.
Nikolay: Yes, I'm an old guy, I still
remember that 9.5 was '15, 9.6
was '16, okay.
So I think you're right, and some
steps could be done here, and
the experience could be improved,
I mean, in smaller steps.
And yeah, but you're talking about
repacking to mitigate
bloat.
Previously, we talked about repacking
to make a partition like
100 times smaller and having more
or less precise data, a different
level of precision.
I think the latter should
be on a schedule.
And to be on a schedule right now,
there is no option in Postgres.
You need pg_cron, right?
That's why I think pg_cron could
help to open some new features
for partitioning if it was in the
core.
Yeah, maybe we have gone sideways
a little bit from time series,
but time series and partitioning
they are very strongly connected
in my head, in my mind.
Michael: Yeah, I completely agree.
I think you mentioned Timescale
has a couple of important features,
but I would add the automatic partitioning
as another thing.
Nikolay: Oh, it's great, yeah,
yeah.
It's very declarative.
You just say, I want every hour,
and then you just forget about
it.
It works really great.
By the way, also, with TimescaleDB
we're not scared to have a lot of partitions.
They're called chunks, right?
Yeah.
And it's interesting that if you
have time series and you want partitioning,
you should explore what will happen
if you have thousands of
partitions, for example.
And this pushes you to use pooling
and to establish connections
less often.
Remember, I have a blog post where
a mistake was made, a relcache-related
mistake.
So every time Postgres connects
again, the first query will be slow
if you have a lot of partitions.
But subsequent queries will be
fast.
So if you have many, many partitions,
you should have connection
pooling and establish connections
less often.
Or somehow accept that the planner
needs more time for the very
first planning.
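A simple way to see this, on a hypothetical partitioned table: compare the reported Planning Time on a brand-new connection versus a warmed-up one.

```sql
EXPLAIN (ANALYZE, SUMMARY)
SELECT avg(value)
FROM measurements
WHERE ts >= now() - interval '1 hour';
-- "Planning Time" in the output includes loading metadata for the partitions;
-- it is highest on the first query of a fresh connection and drops afterwards.
```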
Michael: I guess it depends on
what queries, but yeah, it seems
to me like that would be...
Nikolay: It doesn't depend
on queries.
Because it's the relcache: we
need to load the metadata for
all partitions anyway.
Michael: Yeah, but let's say they're
analytical queries.
It won't matter, a few milliseconds
of planning.
Nikolay: Analytical queries, time series, without TimescaleDB?
Please go out.
It's not a Postgres case.
Well, unless it's a small project.
If it's a small project, fine, but
if it's not small, let's not
fool ourselves.
It's not
Michael: for Postgres.
What about the
Nikolay: roll up tables?
Michael: The roll-up tables are
exactly for that case.
Nikolay: Well, yeah, if you do a
very strict strategy and a very
aggressive roll-up and a three-stage
approach, then maybe your counts
and sums and means, maxes, aggregates
will be fast, I don't know.
Yeah.
But it's easy to experiment, actually.
It's a one-day project to experiment
and check for your particular
case, I mean, to explore what latencies
will be and so on.
It's definitely worth it, if you
plan some new thing, to explore
latencies.
Michael: Yes, yes, yes.
Nikolay: Okay, okay.
I actually described this roll-up
approach myself, not knowing it's
called roll-up, and then...
I'm still pulling us away from Postgres.
But, well, Postgres is great.
You know, when you dig deeper,
you see some problems, you talk
about them, and then people come
to you and say you are a Postgres
hater.
It happened to me more than once.
Yes, that's why I've spent like 20
years and still haven't stopped working
with Postgres and thinking about it, and so
on.
Yeah, I'm a Postgres hater.
Michael: Let's do, should we do
a quick fire, kind of a few last
things in terms of considerations
while we're here?
I think,
Nikolay: Well, we didn't talk about
updates, unfortunately.
Yes, but I think if updates are
in place, definitely the first thing
I would try is to localize them
with partitioning again; partitioning,
localization.
My fresh idea, not mine, I
actually borrowed it from a good
guy who has been an expert longer than me.
The idea is that if you have localized
writes, thanks to partitioning,
it also helps backups, incremental
backups, because you don't
touch old pages.
So this makes incremental backups
lighter, smaller, right?
And recovery as well.
It's also good.
It's not about storage only.
It's not only about the buffer
pool and WAL rates, full-page
writes and so on.
It's also about backups.
So everything is connected here.
It's great to have partitioning
for large tables exceeding, like,
100 gigabytes.
So this is the first thing; compression
of any kind, or this roll-up
idea, that's the second one, right?
Yeah, a couple of
Michael: things we haven't talked
a lot about because of the
volume here. I think disk space
can be a big thing; I know compression
really helps there, but another
thing that can help: if you
do have any secondary indexes, like,
not the primary key, if
you do want another index, maybe
consider BRIN as well, especially
in the append-only case. We
had a whole episode about
it, but with the new operator classes,
with min-max multi, it's
really quite powerful and much,
much smaller than B-tree.
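A sketch, again with hypothetical names; the multi-minmax operator classes arrived in Postgres 14.

```sql
CREATE INDEX measurements_ts_brin
    ON measurements
    USING brin (ts timestamptz_minmax_multi_ops)
    WITH (pages_per_range = 32);
```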
Nikolay: That's a
Michael: great point.
Nikolay: Do you remember the version
when it was added?
16, 17, 15?
Michael: I think it was roughly
when we started, maybe like a
year or 2 before we started the
podcast.
So maybe, yeah.
Nikolay: Anyway, in very fresh
versions.
Yeah, and also continuous aggregates
in TimescaleDB is a great
thing.
If you don't have TimescaleDB,
then something should be done,
like probably incremental materialized
views and so on.
Michael: Yeah, there's that extension.
But also there's a really great
talk by the team at Citus on
roll-ups.
So I will share that in the show
notes.
There's also, on this topic, a talk
with a lot more practical
tips.
It's a talk by a fellow Brit
of mine, Chris Ellis.
The talk he's done was about
the Internet of Things, so sensor
data, but it's basically just time
series, right?
So I'll share a link to that talk
as well.
Nikolay: 4 points.
What's the last 1?
Number 5.
Michael: What about language stuff?
Like, Chris, in his talk, I remember,
talked quite a bit about
window functions and interpolating
gaps.
There are probably some considerations
there; you probably
still want access to SQL.
Postgres is so powerful there that
I think with some of the other
specialist databases, it's slightly
harder to do some of that
stuff, or it's a lot more learning.
Nikolay: My favorite function is
lag.
Yeah.
I deal a lot with lags in
various kinds of replication.
And here we have a window function
called lag, right?
I thought you were going to
Michael: say your internet connection,
but yeah.
Nikolay: Well, that as well.
Lags everywhere.
Anyway, let's consider this number
5.
Postgres has great SQL, standard
SQL support, window functions,
a very rich set of them, and they
can help here as well.
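Two small illustrations, with hypothetical names: per-device deltas using lag(), and simple gap-filling against a generated time grid.

```sql
-- Delta between consecutive readings per device:
SELECT device_id,
       ts,
       value,
       value - lag(value) OVER (PARTITION BY device_id ORDER BY ts) AS delta
FROM measurements;

-- Fill missing minutes for one device with NULLs:
SELECT g.minute, m.value
FROM generate_series(now() - interval '1 hour', now(), interval '1 minute') AS g(minute)
LEFT JOIN measurements m
       ON m.device_id = 42
      AND date_trunc('minute', m.ts) = g.minute
ORDER BY g.minute;
```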
Michael: Yeah, last thing, we have
mentioned Timescale a lot,
but there are other extensions
like Citus, Hydra as well, worth
considering for specific use cases.
And pg_timeseries, I'm gonna check
it out.
I'm not sure quite what they've
got in terms of features yet
but it's Postgres licensed.
Nikolay: And pg_duckdb, which is
a new thing; we had a pg_duckdb
episode.
Michael: Yeah, but I haven't considered
it for time series data.
Nikolay: Do it!
Yeah, why not?
Well, all these things are fresh.
I just saw how my team member used
DuckDB just locally to read
some CSV logs, snapshots actually
collected from Postgres.
It was great; it was some
SQL, DuckDB SQL, and super
convenient as well.
So I think we have a lot of things
and tools which are open source,
and that's great.
Michael: 1 more idea.
Even the topic from last time, or
2 times ago, was PgDog,
like sharding.
If you're getting to extreme, like,
heavy workloads, you could
shard: you could have time series
partitioning on every shard,
but then maybe a range of device
IDs for each 1, and shard based
on some other key.
Nikolay: Right, yeah.
Thank you for this little brainstorm.
It was a pleasure to talk to you
as usual.
Michael: Likewise.