Nikolay: Hello, hello.
This is Postgres.FM.
And my name is Nikolay from Postgres.AI.
And as usual, my co-host is Michael, pgMustard.
Hi, Michael.
How are you doing?
Michael: Hello, Nikolay.
I'm good.
Thank you.
How are you?
Nikolay: I'm fantastic.
And looking forward to our discussion today. I think it's really
important because, you know, some people think everything we
store in a database can probably be considered time series.
At least any small piece of data, any datum, right?
It was born with some creation timestamp, for example, so
we could consider it that way.
Let's discuss what time series is.
Michael: Yeah, I have heard, I've definitely heard people claiming
that all data is time series data but they tend to work for time
series databases so I think there's some correlation or some
incentive.
Nikolay: I'm not sure which one is the cause and which is the consequence
here, it's not obvious.
Michael: Exactly. But for the purposes of this, I like
to think of time series data much more rigidly, as things that
are very, very useful to log at points in time and then compare
and look back at. You know, things we want to know the differences
of, or the averages, the mins, the maxes. That kind
of stuff feels much more useful to me to think of in these
terms, because it brings certain challenges, especially at scale.
So I think if we consider everything time series, this episode
will be very difficult.
Nikolay: Yeah, I agree with you that we can look at it like
this.
Anything can have creation or modification timestamps, but
sometimes timestamps matter a lot, like for analysis and for
operations.
Sometimes they matter less.
For example, if we want to work with some geographical data,
maps and so on, we put cities on the map, for example.
Yeah, they have a creation year, or year of creation.
Sometimes it's not clear actually.
Right.
But for day to day operations, it's not that important.
Maybe population is more important and so on.
It depends.
Maybe position is more important.
So it's hard to say this data is time series, although it has
creation timestamp.
Each part of this has it, right?
So an opposite example is when we log something from something,
from Postgres or from systems or from drivers, I mean, cars,
which are constantly moving on the map, or temperature, or anything,
right?
This kind of data.
There we log many, many times for the same source; we obtain
data from the same source again and again.
For example, there is some sensor or some source
of metrics, and we log these metrics every minute, for example.
In this case it can definitely be considered time series.
And in many cases, not always, but in many cases, we want fresh
data to be more detailed, ideally every second, for example, or
at least every minute. And old data we still want to keep, but
it can be rougher, right?
Less precise, and it can already be aggregated and so on.
And sometimes we can even say, this data is not relevant anymore.
If it's some logs, we don't care about logs from operations from
5 years ago, for example.
That also might happen.
It depends on the concrete project or product you are dealing with
or building, right?
So but what matters is like we obtain data periodically from
some source, and this is definitely time series to me.
Unlike cities.
Michael: I think for the purpose of this conversation, it's probably
worth distinguishing at what point we start caring, because if
we're logging every second, it only takes a few hundred, maybe
a few thousand devices before that's significant scale. But even
if we're logging every hour, if we've got, let's say, a
million devices logging every hour, that's also a lot of scale.
But a lot of use cases won't have that.
So there are smaller cases where, whilst they are time series,
like let's say you've got a little home setup where you are monitoring
the weather outside, maybe in the garden and maybe at the back
of the house, maybe you've got 2 or 3 sensors or
4 or 5 devices logging even every second, say, you
probably don't need this episode, right?
You don't need to do any tuning, you don't need to do that much.
Okay, interesting.
You think
Nikolay: maybe.
It depends.
There is a big misconception among new Postgres users.
They think if the table size is small, in terms of row count, it
should be very fast, no problem.
But we know how MVCC is organized in Postgres, right?
Every update means a new physical
row, which we call a tuple, or tuple;
we're still not sure after 2 years,
almost 3 years of discussing.
Anyway, a physical row version is
created on every update, unless
this update is a special case, like
a HOT update.
It depends.
So it's basically a delete and insert
all the time.
By default it's so, right?
It means if you update the same
row many times, with SELECT
you still see 1 row, but physically
there might be many, many physical
versions, and you have a lot of
dead tuples, and then autovacuum
comes, and autovacuum is a
converter of dead tuples to bloat,
and you have a lot of bloat, and
then you have problems with performance
although you think the table is small.
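A quick, hedged illustration of how to spot this: a "small" table near the top of this list is probably suffering from the update-heavy MVCC pattern Nikolay describes.

```sql
-- Tables with the most dead tuples; compare dead vs. live tuples and watch
-- whether the total size keeps growing despite a tiny row count.
SELECT relname,
       n_live_tup,
       n_dead_tup,
       last_autovacuum,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;
```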
Michael: I completely agree, but
I think it doesn't apply to this use case.
A time series use case normally
is append-only, right? Normally
you're just inserting new rows,
so I don't think it applies here.
Nikolay: Well, I see updates in
time series data all the time.
Michael: Some.
Okay, tell me about that, because
I don't see it, I hear mostly about...
not like...
Nikolay: All right.
Michael: It's append-mostly, let's
say.
Nikolay: Imagine, for example, CI
pipelines.
I won't tell the name of the company,
but some listeners from
that company should recognize it.
You have a pipeline and some
steps, like jobs or something, inside
the pipeline.
And when you create this pipeline,
you then update it a few
times if the status has changed. It
resembles a queue-like thing, but
it's not just a queue, you need
to log everything and present
it.
All right.
So you might create some record
and then update its status.
Yeah.
Right.
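A minimal sketch of the shape Nikolay is describing, with hypothetical table and column names: rows are inserted as events happen, but the status is then updated in place, producing new row versions even though the row count stays modest.

```sql
CREATE TABLE pipeline_jobs (
    id          bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    pipeline_id bigint      NOT NULL,
    created_at  timestamptz NOT NULL DEFAULT now(),
    finished_at timestamptz,
    status      text        NOT NULL DEFAULT 'pending'  -- pending, running, success, failed
);

-- Each status change is an UPDATE, i.e. a new physical row version under MVCC:
UPDATE pipeline_jobs SET status = 'running'                      WHERE id = 42;
UPDATE pipeline_jobs SET status = 'success', finished_at = now() WHERE id = 42;
```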
Michael: But, but let's say my,
Nikolay: my example,
Michael: Well, maybe by some definitions,
but I'm thinking more
like you mentioned logging or you
mentioned monitoring, maybe
real-time analytics, like these
kinds of things where time moves
on and we just want readings over
time, we want to graph.
Nikolay: Yeah, okay, let's agree
on this: we need to distinguish
append-only time series and time
series with some modifications.
Yeah.
Michael: Maybe, but yeah, really
good point though: if you've got
modifications, you might have to
worry about these tips and tricks
quite a bit earlier in terms of
your raw volume.
Good point.
Nikolay: Right, right.
And yeah, so the problem
is not huge scale, but
I think it's a great idea to think
in advance how many records
you expect per unit of time, per
minute, per hour, per day, per month, per year,
and then what your plan is
to do with it, and understand
when it's approaching a billion
rows or a hundred gigabytes or something.
You need to...
Michael: By billion, you mean like
a billion rows?
Nikolay: Billions.
Billions of rows or hundreds of
gigabytes, a terabyte.
For me, a billion rows is roughly
a terabyte for row storage.
Roughly, very roughly.
It depends a lot.
It can be 5x in both directions, right?
But for row-store Postgres,
it's roughly so.
For many time series
database systems which are built only
for time series, like VictoriaMetrics,
or you can consider ClickHouse
as well,
1 terabyte, 1 billion rows, is
maybe 100 gigabytes, maybe 10
gigabytes even, depending on compression.
Their compression is much better.
So this is like overall capacity
planning, right?
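A rough back-of-the-envelope check on an existing table (the metrics table name is hypothetical): total on-disk size including indexes and TOAST, plus average bytes per row.

```sql
SELECT pg_size_pretty(pg_total_relation_size('metrics'))                         AS total_size,
       pg_size_pretty(pg_total_relation_size('metrics') / greatest(count(*), 1)) AS avg_bytes_per_row
FROM metrics;
```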
But we can have performance problems
locally, thinking, oh, this
is a small table,
it should be like a 1 millisecond
lookup.
But then, if you ignore
MVCC in Postgres, you might
end up having a lot of dead tuples
or bloat, and latency will
degrade a lot as well, and that's
not good.
Michael: Cool.
Well, if we go back, if we go to
the append only scenario, what
are the main problems you tend
to see in that setup?
Nikolay: Well, for me, the main
problem is... let's pull out the
elephant in the room.
Let's recognize it.
There is a great product called
TimescaleDB.
Michael: Oh yes, yeah.
Nikolay: Absolutely great.
In my opinion, the 2 major
things they have are compression,
unbeatable compression for Postgres,
for time series, and continuous
aggregates, which is also good,
it's convenient.
You can build continuous aggregates yourself,
but it's super hard to build
good compression.
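For reference, a minimal sketch of what those two features look like, assuming TimescaleDB 2.x under the Timescale (non-Apache) license, with hypothetical table and column names:

```sql
CREATE TABLE metrics (
    ts        timestamptz NOT NULL,
    device_id bigint      NOT NULL,
    value     double precision
);
SELECT create_hypertable('metrics', 'ts');

-- Columnar compression, segmented by device and ordered by time:
ALTER TABLE metrics SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'device_id',
    timescaledb.compress_orderby   = 'ts DESC'
);
SELECT add_compression_policy('metrics', INTERVAL '7 days');

-- Continuous aggregate: hourly averages maintained incrementally.
CREATE MATERIALIZED VIEW metrics_hourly
WITH (timescaledb.continuous) AS
SELECT time_bucket('1 hour', ts) AS bucket,
       device_id,
       avg(value) AS avg_value
FROM metrics
GROUP BY 1, 2;
```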
And the problem is that TimescaleDB
is great, but it's
not part of core Postgres.
It's an extension which is
available only either on Timescale
Cloud or self-hosted, if you are
okay using something that is not open source.
It's source available.
It's very open.
I know they allow many, many companies
to use it,
but still, if you are going to
build some platform, you probably
will not be allowed to use it.
And so on with compression.
I mean, I'm not talking about
Apache-licensed TimescaleDB, I'm
talking about the Timescale-licensed
TimescaleDB, which has compression
and continuous aggregates.
Compression is great.
Again, I recommended reading a
couple of blog posts before, and I
recommend them again, they are amazing.
But it's not core Postgres.
Michael: So...
But I actually think the
list of TimescaleDB features
is a really good list of things
that are tricky with time series
data in general. They've
also done great blog posts
on things like how to copy data
into Postgres, how to ingest data
as fast as possible, so that's one
of the trickier things
about extreme volume.
There was actually a tips-and-tricks
blog post, something like
13 tips for ingesting, and all of these types
of blog posts come out with: oh,
you should use Postgres's COPY feature.
There are some exceptions,
but basically if you can
get the data in in a way that makes
the most efficient use of
writing data in batches, then you
can get much, much better throughput.
But then there are other tips and
tricks, like not having many
indexes on the table that you're
ingesting into.
Maybe none, but probably at least
1.
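A hedged sketch of batched ingest, reusing the hypothetical metrics table from above; the file path is made up, and COPY FROM a server-side file needs appropriate privileges (psql's \copy is the client-side alternative).

```sql
-- Bulk load with COPY, usually much faster than row-by-row INSERTs:
COPY metrics (ts, device_id, value)
FROM '/tmp/metrics_batch.csv'
WITH (FORMAT csv);

-- If you must use INSERT, send many rows per statement rather than one at a time:
INSERT INTO metrics (ts, device_id, value)
VALUES ('2025-01-01 00:00:00+00', 1, 20.1),
       ('2025-01-01 00:00:01+00', 1, 20.2),
       ('2025-01-01 00:00:02+00', 2, 19.8);
```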
Nikolay: Also foreign keys can
slow you down a lot, and also checkpointer
behavior.
If you expect a massive write
coming, like a one-time load,
it's worth thinking about
checkpointer tuning,
for this window only, accepting the
risk that if we crash,
recovery will take longer.
So you raise checkpoint_timeout
and max_wal_size, and it should
speed things up because the checkpointer
will put less pressure on the disk.
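A sketch of that tuning for the duration of a bulk load; the values are hypothetical, and the trade-off is longer crash recovery while they are in effect.

```sql
ALTER SYSTEM SET max_wal_size = '64GB';
ALTER SYSTEM SET checkpoint_timeout = '30min';
SELECT pg_reload_conf();  -- both settings take effect on reload, no restart needed

-- ...run the bulk load, then revert:
ALTER SYSTEM RESET max_wal_size;
ALTER SYSTEM RESET checkpoint_timeout;
SELECT pg_reload_conf();
```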
Michael: Yeah, nice.
Well, on that note, and I know
this is something I think we're
probably going to talk about in
the future, but because you can
set it on a, like for a single
one-off thing, you could even
consider in that case turning off
synchronous_commit for those
ingests, like if you want to...
Nikolay: I don't think it will
help with throughput at all.
Really?
Yes.
Well, how many commits per second
are we going to have?
Michael: Yeah, no, good point.
Nikolay: But also, thinking about
checkpointer tuning, I think
it depends, because in an append-only
specific case, probably we
won't have a lot of benefit from
checkpoint tuning, because checkpoint
tuning for massive writes shines when
we have random inserts to many
pages and we revisit pages.
If in append-only,
Michael: the thing
Nikolay: is that we're constantly
writing to the end of the data,
of the heap, and also the B-tree index.
It's only to the right, to the
right, and it's great.
Michael: Well, that's a good point
then.
Yeah, we need a primary key that
grows to the right.
Nikolay: Yes, that's why UUID v4
is a terrible idea, right?
Yeah, yeah, yeah.
Right.
So you want UUID v7, or just a
bigint, or something else which
grows similarly...
Like a ULID type of thing.
So B-tree inserts will happen only
to the right, always to the right,
and so on.
And the index will be tightly packed and so on.
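A sketch of an append-friendly key, with hypothetical names: a bigint identity (or a time-ordered UUID) keeps B-tree inserts on the rightmost leaf pages.

```sql
CREATE TABLE readings (
    id        bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    ts        timestamptz NOT NULL DEFAULT now(),
    device_id bigint NOT NULL,
    value     double precision
);

-- With UUIDs, prefer a time-ordered variant (UUIDv7) over random UUIDv4.
-- uuidv7() is built in only from Postgres 18; on older versions generate the
-- value in the application or via an extension, e.g.:
-- id uuid PRIMARY KEY DEFAULT uuidv7()
```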
But this also leads us to the question
about partitioning, right?
Because partitioning is the key.
So let me finish my little rant
about Timescale
and how sad it is that
it's not a fully open part
of Postgres, because I know this
is their business.
I fully respect that, at the same
time.
It's just that we have companies who come
to us asking for help.
For many of these companies, we
do a health check, we improve health,
we level up.
Our typical client
is a startup which grows and
starts having some issues; very
often they are on some managed
Postgres service.
So backups are solved, replication
is solved, and so on.
But then they have some problems.
And it's typical that many of them
consider partitioning.
And then we realize, okay, partitioning
is needed because tables
are already large, and we have a simple
rule: if you exceed
100 gigabytes, you need to do partitioning.
But then we see, oh, this data
is actually time series, right?
And it's inevitable to consider
TimescaleDB, but, for example,
if they are on RDS or somewhere else
similar, they cannot use it.
All they can do is start
doing partitioning themselves,
maybe with Partman, maybe not.
Or migrate to Timescale Cloud.
And we sometimes can help them
to consider that.
Michael: Or migrate to self-host,
or like, you know, self-managed
with TimescaleDB?
Nikolay: Well, it depends.
For some companies it's a good
path, but if they have a very
lean approach in terms of how many
engineers they want to have,
how many of them they want to be
involved in maintaining backups,
replication, and so on,
it's a big question.
Honestly, I fully agree with
people like 37signals about
migration back from the cloud,
right, even.
Or at least from a managed service.
I think it will be happening, and
if a crisis hits us and so on,
it will be happening for sure.
Many people will try to move away
from RDS, because honestly I think
these days we have all the pieces,
already battle-proven, to have
a cluster working with backups and
replication, autofailover, everything
like that.
It may be Kubernetes, maybe not
Kubernetes, it doesn't matter actually.
We have many pieces already battle-proven,
and we can live without
RDS or any managed service and
not spend a lot of effort to
maintain this.
But still, there is also a business
for Timescale.
They want to have paid customers.
Again, I respect that.
And sometimes we have to consider
moving there, but if you move
there, there are also complications,
because Timescale Cloud has limitations,
like just a single logical database,
right?
In some cases it's not convenient:
you have a cluster, but you
cannot run an additional CREATE
DATABASE and have multiple logical
databases inside.
This is a limitation of Timescale
Cloud.
And anyway, it's like the Middle
Ages in Europe.
We have many, many, I don't know,
dukes or so, and many
counties, right?
Counties.
Yeah.
It's called counties in your country
and in our country as well.
Like, I'm living in San Diego County.
So yeah, this is a
fragmented space.
Right.
And if you want good TimescaleDB,
you need to
go either there or here; you cannot
do it on any other managed
service.
It's bad. I don't understand
why they cannot reach some
business agreement.
I mean, AWS and the Timescale company,
and Google, and Azure, I mean,
Microsoft, and so on; it would be
good for everyone.
Right, it's ridiculous.
Because I'm telling everyone,
compression is great.
Read these posts, you'll understand;
just run experiments, it's
so great, right?
Without it, my advice is:
okay, maybe not Postgres
for this kind of data, maybe ClickHouse.
ClickHouse was created to handle
a Google Analytics-like
load.
You ingest a lot of logs into it,
a lot, a lot of time series
data, and store it and so on.
TimescaleDB shines in compression
because they apply
two-dimensional compression;
they also compress column-wise,
right?
That's great.
And if you have
metrics from some sensor coming
to you every second, for example,
they cannot change a lot quickly,
right?
They change a little,
like a temperature or position,
right?
The position of some driver.
And it means that we can convert
them to deltas and then additionally
compress, and additionally compress,
and TimescaleDB reaches something
like 30x compression, for example.
I remember some case.
It's great. Without it, wow, it's
a lot of data. And it's not only
storage, but it's also memory,
the buffer pool, WAL...
Michael: WAL.
OK, yeah.
Let me pull us back, because I
think you're right that at a certain
scale, even if you only care about
pure disk size, it really, really matters.
But I think there are a lot of use
cases, I'd say most use cases
that I see, that are smaller.
Like, there are a lot of time series
use cases that are
in that middle
ground, right?
Like, they're big enough to be
considering partitioning, certain
maintenance optimizations, some really
considerate schema design, but they're
still manageable within core Postgres
quite healthily.
Nikolay: If you have a project which
you predict won't explode in
terms of growth, it's okay.
Michael: Yeah, or it's like growing
not exponentially, but like
steadily.
Nikolay: If you expect you won't
have more than like 10 terabytes
of data in the next 10 years,
oh, it's okay.
Maybe 5 years.
It's okay to implement partitioning
yourself, and maybe aggregate
old data and repack it somehow,
for example.
Partitioning is great because it
gives you many things.
For example, if you have a
partition which is one day
from the previous month, okay, it's
time to repack it and store only
aggregates, rough data,
per hour, for example, not per
second anymore.
It's a huge reduction.
And if you don't have partitioning,
it will be a nightmare to deal
with updates or deletes, and then
MVCC again, right? Autovacuum will
come and you have empty space, and
then some new insert will go
there, and now this page is out
of the visibility map; it's not
all-visible, it's not all-frozen,
so autovacuum needs to chew
it again. If we have
partitioning, we have locality. And,
by the way, I talked
to someone recently and we
discussed partitioning, custom
partitioning without TimescaleDB, and the discussion was: should
it be partitioned by customer or
project, or time-wise?
Of course time-wise. If we talk
about this kind of data, it
must involve time
in the partitioning, because in
this case you will have a fresh
partition where all inserts are coming,
to one partition, right?
And then you have archived data,
and you can deal with it separately.
So you can repack a partition, detach
the old partition,
attach a new partition, whatever
you want, fully online.
And autovacuum will be fine, and
no new inserts will come to
this partition, right?
Unless you support some occasional
updates of old data.
Even then, if those updates are not
massive, it's still good.
You can do that; you can repack
by updating, basically.
You have a second-resolution partition
for a whole day from the
previous month, and then you convert
it to, for example, hour
resolution.
3600 times smaller.
It's great.
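A sketch of this recipe with native declarative partitioning; table names, intervals, and aggregates are hypothetical.

```sql
CREATE TABLE measurements (
    ts        timestamptz NOT NULL,
    device_id bigint      NOT NULL,
    value     double precision,
    PRIMARY KEY (device_id, ts)
) PARTITION BY RANGE (ts);

CREATE TABLE measurements_2025_06_01 PARTITION OF measurements
    FOR VALUES FROM ('2025-06-01') TO ('2025-06-02');

-- Roll an old daily partition up to hourly resolution:
CREATE TABLE measurements_hourly (
    bucket    timestamptz NOT NULL,
    device_id bigint      NOT NULL,
    avg_value double precision,
    max_value double precision,
    n         bigint,
    PRIMARY KEY (device_id, bucket)
);

INSERT INTO measurements_hourly
SELECT date_trunc('hour', ts), device_id, avg(value), max(value), count(*)
FROM measurements_2025_06_01
GROUP BY 1, 2;

-- Then detach the detailed partition (CONCURRENTLY since Postgres 14) and
-- drop or archive it:
ALTER TABLE measurements DETACH PARTITION measurements_2025_06_01 CONCURRENTLY;
DROP TABLE measurements_2025_06_01;
```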
And actually, this is a great recipe.
This is not compression, but it's
kind of a replacement, an
alternative, right?
And then you have...
Michael: I hear this get called
roll-up tables quite often, which
I really like.
And you could have multiple granularities,
like you could have
hourly, but you could also have
daily, monthly, like depending
on how fast you need those queries
to be or what you're using
them for, what your dashboards
are showing or what customers
need to see.
You can have a few different ones
and each one is going to be an
order or orders of magnitude smaller
than the previous one and
therefore faster to query.
Exactly.
Yeah.
So I think this is a really good
solution, and the other thing
about partitioning I'm not sure
you mentioned yet is not just
rolling up, but
also just purely dropping the...
Nikolay: Yeah, yeah, yeah, that's...
Michael: ...getting rid of that data.
Nikolay: A solution, yeah. Or, well,
there are also attempts to
have tiered storage.
So maybe old partitions should
go to S3 or GCS or the blob storage
on Azure, whatever it's called, I'm
still trying to remember.
Anyway, object storage, and
there we can choose the tier,
like S3 Glacier, right?
It's cheaper.
It's slower, but people rarely
access this data, and so on.
Yeah, there are several options
here.
You need some strategy for data
management in the longer term,
right?
For this data.
I see how companies struggle when
they're already hitting some
limits and they need to do something
about it.
But it's a lot of effort to redesign
things.
If you think about it earlier,
maybe it will save you something.
It's harder in the beginning, as
usual.
A bigger challenge to
solve.
But then it
Michael: pays off.
Yeah.
I think this is such a difficult
1.
Like, you've got all of the traps.
You've got traps in both directions,
right?
You've got all of the premature
optimization traps, but then
also...
I don't...
Nikolay: Well, yeah.
If it was...
Or just use TimescaleDB, that's
it, right?
Michael: So I do think there is
an argument for it, and not just
because of the scalability, but
I think, like, developer experience-wise,
you don't have to...
Like, even with pg_partman, there's
a bit of setup.
Like, with continuous aggregates,
there's quite a lot of setup
for you if you want to
Nikolay: do that.
But pg_cron and
pg_partman are available
everywhere.
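A sketch of that setup, assuming pg_partman 5.x and pg_cron are installed (and reusing the hypothetical partitioned measurements table from the earlier sketch); exact function signatures differ between pg_partman versions.

```sql
CREATE SCHEMA IF NOT EXISTS partman;
CREATE EXTENSION IF NOT EXISTS pg_partman SCHEMA partman;
CREATE EXTENSION IF NOT EXISTS pg_cron;

-- Let pg_partman manage daily partitions, pre-creating future ones:
SELECT partman.create_parent(
    p_parent_table := 'public.measurements',
    p_control      := 'ts',
    p_interval     := '1 day'
);

-- Run partition maintenance (pre-creation, retention) every hour via pg_cron:
SELECT cron.schedule(
    'partman-maintenance',
    '5 * * * *',
    $$CALL partman.run_maintenance_proc()$$
);
```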
Michael: I completely agree, I completely
agree. I'm just saying,
from a UX point of view, like a developer
experience point of view,
there's also a bunch of functions
that come with TimescaleDB
that are really useful and make certain
things much simpler to write
queries for.
So I do think there's an argument
for that, but I do think you
are then trading off freedom; there
are then only so many
managed services you can use. And
I think there was even a recent
alternative that popped
up from the team at Tembo.
I don't know if you... I haven't
had a chance
Nikolay: to look at
Michael: it properly yet, but pg_timeseries?
Nikolay: Ah no, this I haven't
tried.
Michael: Yeah, it's new, and they give
the exact same reason in
the readme; you can see the exact
reason they've done it is because
they could only have used Apache-licensed
TimescaleDB and they realized
they needed something a bit more
than that.
Nikolay: And then additional complexities
hit you when you grow,
and at some scale schema changes
become a problem.
Right? For partitioned tables they
are much harder; they're
not easy to do at scale anyway.
But with partitioned tables...
Michael: But this was kind of my point, that you're
paying some complexity up front for doing this, and it is kind
of a premature optimization to partition a table that is absolutely
tiny. And yes, it will be a pain to move from a single table to
a partitioned table later, but you've also got away with some simplicity
for maybe a year or 2, maybe longer.
Nikolay: Well, anyway, I think if you have a time-series
situation, yeah, and you expect some terabytes
of data in a few years, I would implement partitioning right
away anyway, because this would keep me from writing queries
which are not compatible with partitioning.
And those queries are really hard to redesign later.
It's better to have some partitioning right away, and the partitioning
key should involve the timestamp.
And that's great.
I mean, yes, it's some inconvenience in the beginning, but it's
not rocket science at all these days; many people have done it
and there are blog posts.
I have a how-to recipe on how to partition a table with
UUID v7.
The example there uses TimescaleDB, but the approach is
agnostic, so it can be done without TimescaleDB, same thing.
And yeah, I think partitioning
is a must if you expect terabytes
of data; if you expect dozens of terabytes, it becomes an absolute
must from the beginning.
And it's a pity that we don't have... Like, do you think partitioning
could be improved in Postgres?
In which areas?
In which areas?
Michael: Well, to be fair, it has been improving.
Like, there's so...
Each release, every release, there's quite a significant improvement.
Nikolay: Well, in
Postgres 10 we got declarative partitioning, right?
Michael: Yes.
Nikolay: And then it was improving only, like, evolution-wise, like, polishing
things.
Michael: Okay.
Nikolay: Do you think a big step could be done UX-wise,
to simplify it?
For example, I remember we were helping a very famous company
to partition, like, during a weekend.
I think it was more than 1 year ago.
It was an AI company, super popular.
We helped them and it was great, and it was actually an interesting
experience to help do it very quickly without downtime
and so on.
Michael: Without downtime is the hard part.
Nikolay: Yeah, but for me it's
easy because I know the... like,
for me, not so easy, but I know
how to do it and how to verify
it before deployment and so on.
All my focus was on that part.
What we missed, I'll tell you the
story, it's funny.
I was washing my car and I was
chatting with a friend, also from
the Bay Area, and I told him,
you know, we helped this company
to implement partitioning.
He mentioned he's using
the products of that company.
Okay.
He's using the product.
I mentioned, okay, we just helped
them implement partitioning.
And he said, oh, the service is
down right now.
I said, really?
And I'm looking at my watch and
I see 5 p.m.
And I realized, 5 p.m.
Probably this is the time, it's
like 5-0-0, right?
It's probably the time when a new
partition must be created.
And this we completely missed.
Creation of new partitions.
Michael: Wow.
Nikolay: It's like we focused on
hard topics and failed at
Michael: simple topics.
Nikolay: You must create partitions
in advance.
And this is on the shoulders of the developer.
That's not right, right?
We need pg_cron or something.
Why is it like...
DX, developer experience, is not
excellent, unfortunately.
I think some things can be improved
in this area.
But of course it could be good
to do it evolutionarily as well.
Like maybe pg_cron should be part
of Postgres, first thing.
Cool.
Yeah, maybe.
At least a contrib module
or something, I don't know.
And then probably partition creation
could be part of the declarative
syntax: you define it when you
create your partitioning schema,
you say, okay, new partitions should
be created hourly or daily
or something.
Yeah.
Right?
Right now it's terrible, everything
is manual.
I mean, it's not terrible, it's
much better than when I implemented
partitioning for my first RDS project
and it was inheritance-based.
I did it with 0 downtime, it was
great, but I spent like a couple
of months.
Yeah, this was the time when I started
understanding that clones are
great, because you can experiment
and check everything.
Michael: Going back, I think what
I meant when I said no downtime
is the hard part: I mean, new features
for partitioning generally
come in and there are so many kind
of, like, foot guns of what you
can do without a heavy lock and
what needs...
Nikolay: 1 keys
Michael: ...like even dropping an
index. So there are so many
tricky things that get implemented
first in a way that has heavy
locks, and then later in a way that
can be done without, so therefore
in a 0-downtime fashion.
So I do see it as probably only going
to improve incrementally,
but I don't see why automatic creation
couldn't be a complete
thing.
It could also help... yeah, remember
we were talking about pg_squeeze,
and that's another feature; even
though I think it was VACUUM
FULL CONCURRENTLY that could come in.
With glass
Nikolay: option maybe, right?
Michael: Yeah, but the other feature
it has is scheduled, let's
say, repacking.
Repacking.
Or whatever, squeezing.
So scheduled squeezing.
Well, it's different.
It would need the same kind of
logic though, right?
It would need the same... I guess
that's trigger-based instead
of time-based, but it's a similar
kind of background worker type
thing.
Nikolay: Oh yeah, maybe you're
right, actually.
Yeah.
I see evolution here.
First of all, I fully agree, there is a
whole class of problems that
are gradually being solved in every major
release related to locks,
partitioning versus locking, right?
Indexes, foreign keys, detaching,
attaching; we remember every
release for the last 10 years maybe.
When version 10 was released,
it was 2017, I guess.
Michael: The way I like to do it: 17 is the
most recent, so it's 7 years
ago.
Nikolay: Yes, I'm an old guy, I still
remember that 9.5 was '15, 9.6
was '16, okay.
So I think you're right, and some
steps could be done here, and
the experience could be improved,
I mean, in smaller steps.
And yeah, but you're talking about
repacking to mitigate
bloat.
Previously, we talked about repacking
to make a partition like
100 times smaller and having more
or less precise data, a different
level of precision.
I think the latter should
be on a schedule.
And to be on a schedule right now,
there is no option in Postgres.
You need pg_cron, right?
That's why I think pg_cron could
help to open some new features
for partitioning if it was in the
core.
Yeah, maybe we have gone sideways
a little bit from time series,
but time series and partitioning
they are very strongly connected
in my head, in my mind.
Michael: Yeah, I completely agree.
I think you mentioned Timescale
has a couple of important features,
but I would add the automatic partitioning
as another thing.
Nikolay: Oh, it's great, yeah,
yeah.
It's very declarative.
You just say, I want every hour,
and then you just forget about
it.
It works really great.
By the way, also, with TimescaleDB
we're not scared to have a lot of partitions.
They're called chunks, right?
Yeah.
And it's interesting that if you
have time series and you want partitioning,
you should explore what will happen
if you have thousands of
partitions, for example.
And this pushes you to use pooling
and to establish connections
less often.
Remember, I have a blog post where
a mistake was made, a relcache-related
mistake.
So every time Postgres connects
again, the first query will be slow
if you have a lot of partitions.
But subsequent queries will be
fast.
So if you have many, many partitions,
you should have connection
pooling and establish connections
less often.
Or somehow accept that the planner
needs more time for the very
first planning.
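A simple way to see this, on a hypothetical partitioned table: compare the reported Planning Time on a brand-new connection versus a warmed-up one.

```sql
EXPLAIN (ANALYZE, SUMMARY)
SELECT avg(value)
FROM measurements
WHERE ts >= now() - interval '1 hour';
-- "Planning Time" in the output includes loading metadata for the partitions;
-- it is highest on the first query of a fresh connection and drops afterwards.
```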
Michael: I guess it depends on
what queries, but yeah, it seems
to me like that would be...
Nikolay: It doesn't depend
on queries.
Because it's the relcache: we
need to load the metadata for
all partitions anyway.
Michael: Yeah, but let's say they're
analytical queries.
It won't matter, a few milliseconds
of planning.
Nikolay: Analytical queries, time series, without TimescaleDB?
Please go out.
It's not a Postgres case.
Well, unless it's a small project.
If it's a small project, fine, but
if it's not small, let's not
fool ourselves.
It's not
Michael: for Postgres.
What about the
Nikolay: roll up tables?
Michael: The roll-up tables are
exactly for that case.
Nikolay: Well, yeah, if you do a
very strict strategy and a very
aggressive roll-up and a three-stage
approach, then maybe your counts
and sums and means, maxes, aggregates
will be fast, I don't know.
Yeah.
But it's easy to experiment, actually.
It's a one-day project to experiment
and check for your particular
case, I mean, to explore what latencies
will be and so on.
It's definitely worth it, if you
plan some new thing, to explore
latencies.
Michael: Yes, yes, yes.
Nikolay: Okay, okay.
I actually described this roll-up
approach myself, not knowing it's
called roll-up, and then...
I'm still pulling us away from Postgres.
But, well, Postgres is great.
You know, when you dig deeper,
you see some problems, you talk
about them, and then people come
to you and say you are a Postgres
hater.
It happened to me more than once.
Yes, that's why I've spent like 20
years and still haven't stopped working
with Postgres and thinking about it, and so
on.
Yeah, I'm a Postgres hater.
Michael: Let's do, should we do
a quick fire, kind of a few last
things in terms of considerations
while we're here?
I think,
Nikolay: Well, we didn't talk about
updates, unfortunately.
Yes, but I think if updates are
in place, definitely the first thing
I would try is to localize them
with partitioning again; partitioning,
localization.
My fresh idea, not mine, I
actually borrowed it from a good
guy who has been an expert longer than me.
The idea is that if you have localized
writes, thanks to partitioning,
it also helps backups, incremental
backups, because you don't
touch old pages.
So this makes incremental backups
lighter, smaller, right?
And recovery as well.
It's also good.
It's not about storage only.
It's not only about the buffer
pool and WAL rates, full-page
writes and so on.
It's also about backups.
So everything is connected here.
It's great to have partitioning
for large tables exceeding, like,
100 gigabytes.
So this is the first thing; compression
of any kind, or this roll-up
idea, that's the second one, right?
Yeah, a couple of
Michael: things we haven't talked
a lot about because of the
volume here. I think disk space
can be a big thing; I know compression
really helps there, but another
thing that can help: if you
do have any secondary indexes, like,
not the primary key, if
you do want another index, maybe
consider BRIN as well, especially
in the append-only case. We
had a whole episode about
it, but with the new operator classes,
with min-max multi, it's
really quite powerful and much,
much smaller than B-tree.
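A sketch, again with hypothetical names; the multi-minmax operator classes arrived in Postgres 14.

```sql
CREATE INDEX measurements_ts_brin
    ON measurements
    USING brin (ts timestamptz_minmax_multi_ops)
    WITH (pages_per_range = 32);
```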
Nikolay: That's a
Michael: great point.
Nikolay: Do you remember the version
when it was added?
16, 17, 15?
Michael: I think it was roughly
when we started, maybe like a
year or 2 before we started the
podcast.
So maybe, yeah.
Nikolay: Anyway, in very fresh
versions.
Yeah, and also continuous aggregates
in TimescaleDB is a great
thing.
If you don't have TimescaleDB,
then something should be done,
like probably incremental materialized
views and so on.
Michael: Yeah, there's that extension.
But also there's a really great
talk by the team at Citus on
roll-ups.
So I will share that in the show
notes.
There's also, on this topic, a talk
with a lot more practical
tips.
It's a talk by a fellow Brit
of mine, Chris Ellis.
The talk he's done was about
the Internet of Things, so sensor
data, but it's basically just time
series, right?
So I'll share a link to that talk
as well.
Nikolay: 4 points.
What's the last 1?
Number 5.
Michael: What about language stuff?
Like, Chris, in his talk, I remember,
talked quite a bit about
window functions and interpolating
gaps.
There are probably some considerations
there; you probably
still want access to SQL.
Postgres is so powerful there that
I think with some of the other
specialist databases, it's slightly
harder to do some of that
stuff, or it's a lot more learning.
Nikolay: My favorite function is
lag.
Yeah.
I deal a lot with lags in
various kinds of replication.
And here we have a window function
called lag, right?
I thought you were going to
Michael: say your internet connection,
but yeah.
Nikolay: Well, that as well.
Lags everywhere.
Anyway, let's consider this number
5.
Postgres has great SQL, standard
SQL support, window functions,
a very rich set of them, and they
can help here as well.
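Two small illustrations, with hypothetical names: per-device deltas using lag(), and simple gap-filling against a generated time grid.

```sql
-- Delta between consecutive readings per device:
SELECT device_id,
       ts,
       value,
       value - lag(value) OVER (PARTITION BY device_id ORDER BY ts) AS delta
FROM measurements;

-- Fill missing minutes for one device with NULLs:
SELECT g.minute, m.value
FROM generate_series(now() - interval '1 hour', now(), interval '1 minute') AS g(minute)
LEFT JOIN measurements m
       ON m.device_id = 42
      AND date_trunc('minute', m.ts) = g.minute
ORDER BY g.minute;
```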
Michael: Yeah, last thing, we have
mentioned Timescale a lot,
but there are other extensions
like Citus, Hydra as well, worth
considering for specific use cases.
And pg_timeseries, I'm gonna check
it out.
I'm not sure quite what they've
got in terms of features yet
but it's Postgres licensed.
Nikolay: And pg_duckdb, which is
a new thing; we had a pg_duckdb
episode.
Michael: Yeah, but I haven't considered
it for time series data.
Nikolay: Do it!
Yeah, why not?
Well, all these things are fresh.
I just saw how my team member used
DuckDB just locally to read
some CSV logs, snapshots actually
collected from Postgres.
It was great; it was some
SQL, DuckDB SQL, and super
convenient as well.
So I think we have a lot of things
and tools which are open source,
and that's great.
Michael: 1 more idea.
Even the topic from last time, or
2 times ago, was PgDog,
like sharding.
If you're getting to extreme, like,
heavy workloads, you could
shard: you could have time series
partitioning on every shard,
but then maybe a range of device
IDs for each 1, and shard based
on some other key.
Nikolay: Right, yeah.
Thank you for this little brainstorm.
It was a pleasure to talk to you
as usual.
Michael: Likewise.