Postgres FM

Nikolay and Michael discuss timestamps and time math in Postgres — particularly around data type choice, functions available, and some things to watch out for.

Show Notes

Here are links to a few things we mentioned: 

------------------------

What did you like or not like? What should we discuss next time? Let us know by tweeting us on @samokhvalov / @michristofides / @PostgresFM, or by commenting on our Google doc.

If you would like to share this episode, here's a good link (and thank you!)


Postgres FM is brought to you by:

With special thanks to:

Creators & Guests

Host
Michael Christofides
Founder of pgMustard
Host
Nikolay Samokhvalov
Founder of Postgres AI

What is Postgres FM?

A weekly podcast about all things PostgreSQL

Michael: Hello and welcome to Postgres fm,
a weekly show about all things PostgreSQL.

I'm Michael, founder of PG Mustard.

This is my cohost Nicolay, founder of Postgres ai.

Hey, Nicolay, what are we talking about

Nikolay: Hi Michael.

Let's talk about something priceless.

Time, time stamps, time zones, and so on.

Michael: I like the priceless reference.

That's great.

Yeah.

So this is another, uh, listener
request from quite a while ago.

Nikolay: Let's keep this episode short.

It's about time.

Michael: Oh, nice.

Um, yeah, it's a great suggestion.

Great request.

it's one of our earliest requests actually.

so it was specifically about time stamps in
general and time math, or arithmetic maybe.

So yeah, it's quite broad, but also, I guess
we can take it in a few different directions.

where did you wanna start?

Nikolay: Let's start with idea
that, timestamp with time zone.

Should be always preferred over
regular timestamp without time zone.

What do you think about this idea?

Michael: yeah, so I guess we're
jumping into the types here on how
we're storing these in the database.

And there's a couple, that Postgres
lets you, use timestamp and timestamptz.

So timestamp without time zone
and timestamp with time zone.

I think it's quite a confusing name.

and the reference I normally point
people to is the Postgres Wiki.

They've got an excellent list that's quite
aggressively titled, called Don't Do This,
but one of the, they, it is, it's aggressively
titled, but it also includes information.

you, when it might make sense,
you know, what are the exceptions?

But one of the, one of the ones
that has very few exceptions is you
probably always want time stamp tea.

Uh, when you're storing

Nikolay: Right.

And, uh, since it's couple of more letters,
, it's easier to write Justin Stamp and then
realize that maybe it was not the best choice.

I would, uh, like to, uh, mention
that, Uh, usually it's my perception.

Usually you think about timestamps as something
bigger than regular numbers, integers, especially
if it's with something else like timezone, right?

if you think about storage, You maybe
think, okay, I maybe need to use regular
numbers and of course I don't want to bump
into the limit of four byte integer, so
I will choose eight byte integer, right?

So in, in eight or begin, and then you realize
that regular timestamp is also eight bytes.

And then you realize that, timestamp
with timezone is also eight bytes.

They are all eight bytes.

So the choice should not be based on
storage here in, in a big end timestamp.

Or timestamp.

Is that timestamp with timezone?

If we timestamp, is that, is, is, is Pause system.

because it's a standard way, SQL
standard way or data type is timestamp
with time zone for four words.

so I personally try to encourage all engineers
usually to choose by default choose, times
stamp to that explaining that there is no
storage benefits of regular times stamp, right.

Michael: Yeah, exactly.

And there's loads of benefits
of the timestamp T z.

So for example, if you are doing, I guess we'll
go, go onto arithmetic later, but if we are.

Calculating the difference between a timestamp,
at least in the uk I think a lot in the us.

A lot of other countries observe daylight savings.

If we're doing arithmetic
between two timestamps on what?

And we're crossing a boundary,
like a daylight savings boundary.

We won't get the right answer if we or
we are more likely to have issues if we
are using time stamp without time zone.

so Postgres can handle all of that stuff,
that complicated stuff, the reasons people
talk about times being difficult, that that
can all be handled by Postgres if we're
telling it that this is, um, and I guess
should we, should we mention the name?

It's not the, with timezone, I think.

people off and is confusing, from the Wiki
I stole, their description that times stamp
is a bit like a photograph of a calendar
and a clock, but you have no context.

and.

Timestamp with timezone.

It knows, or it's firstly it can convert
that, or it does convert that to utc and
it, it has the context of a point in time
so it, it knows when that is, is rather than

Nikolay: Right.

And, uh, the resulting value will depend on the
context which you can change, in any session.

So one session sees one value, final value,
another session sees another value just because
they are sitting in different time zones.

So that's why it's more, flexible
and convenient and so on.

Michael: Yeah.

So you've already, you've already
touched on the storage, efficiency.

are there any other, Kind of performance
related things that you wanted to

Nikolay: I

Michael: with

Nikolay: good question.

I don't know.

I never thought about like
timestamp to that less performance.

I never try to measure it.

It's a good question, but I think I
doubt there is big difference here.

Michael: Based on my understanding of how
they're stored, I don't see how they could be.

but yeah, it let us know if you know

Nikolay: Yeah.

You need, uh, of course you need to, to like
to do some etic uh, evacuation in run time.

if you have with time zone.

But I don't know.

I think it's, uh, very small, of
course, an interesting exercise to
compare and, and see some differences.

Maybe there are already these
exercises in some blog post, I dunno.

But, uh, I wanted to mention a
dangerous part of SQL Standard, which
is, a operator called at Time Zone.

It's very dangerous.

It's the ugliest parts, the ugliest corners of.

This huge building called, SQL Standard.

So it can be very tricky if you don't
know that is d It's like mouths, right?

So mouths are dangerous and
at time zone is dangerous.

If you check, how it works and, uh, what
is resulting data type for which value
For timestamp, it'll produce timestamp at
time zone for times, stamp at time zone.

It'll produce times.

. If you take timestamp without time
zone and say at time zone it, you will
have timestamp at time, time zone.

If you take a timestamp at some
time zone and at time zone, it'll
produce timestamp without time zone.

And this is, this is from standard as I remember.

So if you go there, just
open the reference , right?

And, uh, do any movements with reference.

No movements without it, right?

just, just a warning, for.

So, performance, I, I, I didn't think we will
discuss a lot of performance, uh, questions.

Maybe just the idea that, of course,
it's perfectly indexible, both timestamp
and timestamped, is that, If you
have such columns, you can the exam.

But I had several times I had a case in my
engineering, pr engineering practice when
I needed to have, um, something like age
by, by the way, there is such function age.

, which, uh, you can give either two time stamps
for to it, and it'll just calculate interval.

Oh, there is interval type data type.

It's it's whole new world time
ranges, intervals and so on.

so you can give, two points in time
and it'll, it'll calculate difference.

Or you can just give one timestamp and
it will compare it to the current time.

Right.

And, uh, it'll be just a regular like age.

So you say like, my birthday is.

and it gives you age, uh, interval.

But, I wanted to store it this interval and
then I wanted to index it several times.

And, uh, always it was a question how
to do it because you cannot have an
index, uh, on something which involves,
time zone because it's volatile, right?

It depends on, on the context, on your session.

If you, if you change time zone
in session saying, say time zone.

, I don't know, like, Europe,
Berlin or something like this.

Uh, in this case you will,
or new time zone, by the way.

Uh, Europe ki there is new time zone.

Yeah.

But it's not a benefit from
PO's coming from POGS itself.

It's it's lower level.

It's, sorry, I, I don't know libraries, but
it's coming from under underlying software, So
if you, set it in session, your value will be
different and probably interval will be different.

So you cannot index it easily.

If, if you, for example, you want to index
expression, You store time zone, but you want
to index to support fast lookups based on age.

And, uh, my work around was, uh, I
switched different approach to store time.

I, based on Unix times stamps and so on.

I just, said, okay, but Unix
times stamps are limited on.

On the beginning, uh, 1970, maybe January 1st, I
don't remember, but some point in time in 1970,
and then it's just a number of seconds passed.

Unix timestamp is number of
seconds passed since then.

By the way, I wish in Povi it would
be easier to convert from timestamp
to Unix timestamp back and forth.

It's, it's quite, Quite an expression,
you need always to Google it and stack or
flow or something will help you, right?

it's memorize it, like memorize it.

It takes, uh, like five years of practicing
constantly to, to write it blindly.

So, hu timestamps are limited on
one and they are technically limited
cause of capacity ofs on another end.

my idea was, okay, I say our software will
exist only till 2050, so we have some line and
we can measure, we can see progress, and we can
have ranges, and this is very stable already.

Of course, I lose time zone benefits and flex.

Convenience.

But, uh, I, I'm on perfect.

Um, range of numbers we live,
this is our universe, right?

And we deal with only with points on this range.

And it's when you think you do there,
you can index any expression, anything.

So this is what my approach as working around.

Maybe there are other approaches to, to
deal with age and, uh, and, uh, still
build an index on some expression.

Involving age, but usually it's a trick and
people bump into it and think, oh, what to do.

So, uh, what else?

What else do you have in
mind related to timestamps?

Michael: So well quickly while we're like, cause
I don't think we'll come back to performance.

I guess this is a good time to mention that.

Obviously Bree indexes work nicely for,
uh, things that are sorted like this.

But we've also had a, we've done a, an
episode on BRI indexes, so that's a,
timestamps are probably the most common.

Uh, use of print indexes that I've seen.

I, I think that's what they, they're
mostly used for in the world, so, yeah.

Refer back to that maybe if that's of interest.

you mentioned intervals.

Should we cover those

Nikolay: yeah, but, but one,
one more comment about indexes.

Uh, there is a very popular
type of index called gin, right?

And there is big problem, uh, that usually you
have a full text church, and you want to order by.

uh, number all timestamp.

And to solve this, there is index called ram.

it's not present, I think on rds.

I, I remember I personally asked
them to consider it, but the problem
with it, usually index size is huge.

I know super base by the way, includes it, ram,
I, I wonder what is, what's experience there
because my experience was, I considered it several
times, and, uh, every time I considered it around.

Huge.

So idea of Ramm is you have Gene, plus you have
knowledge about some timestamp or integer eight.

They originally developed only
timestamp, support, but I asked also
to at INTE eight, they they, it was many
years ago, so now it has both options.

So Ram is like gene extended with some times.

So now you can have, in single index
scan, you can have almost single index
scan, but you can have, uh, both.

Operations covered, full text, search and
order by, uh, sometimes stamp, which is
perfect for, like search in some comments.

For example, you always usually want
to fresh comments coming first, right?

So it's very great idea.

I wish post this, uh, like it would be
more include to core contribution models,
maybe anti contribute models, but it.

And it has issues with size and,
and affecting performance and so on.

But idea is great and like, I mean, the goal
is great, so also it becomes timestamps.

So it's interesting that it injects timestamp to
gin and it becomes around very strange, right?

Michael: At the risk of going o off topic,
I know gin is not named after the alcohol.

Uh, and one I think is a play on that, that it is.

But there's also, I heard, I only read
briefly about it, but, uh, an index type

by the

Nikolay: was only like, uh, in development
and never reached some like production stage.

But Ram is quiet like.

quite finished and you can consider it,
but like I just, I'm just saying that
my personal experience was not perfect,
but doesn't mean that there is no need.

Need is is huge.

Like usually we end up, having two types of plans,
like either posts either chooses to use full check
search and then order by memory, or it chooses
to walk along primary key or index on times.

Tease that and then, uh, apply filtering,
in memory for full tech search, which
both are not perfect paths, right?

So like the problem is obvious and it's
not fully solved, but with Ram solved
again, like maybe you should try it.

So, we should also mention that there is,
uh, there is date and there is time, And
it, when some engineer creates something,
some column call, calling this column time,
but it's timestamp, it breaks my mind.

Always because like, okay, this,
this, uh, frame framework of.

In pogs, I like adopted it already and
I cannot, see time when it's timestamp.

No, it's not time.

It also includes date, like
at least let's say daytime.

By the way, maybe standard also has daytime.

I remember other database system have daytime
as data type, so there is date, there is time.

When we combine them, it's
the is time stamp, right?

Like it's full thing.

Michael: Yeah.

Nikolay: we can have also a column of type.

Right Or of type time.

Why not?

Michael: Yeah, but I think date gets date's
difficult because of the boundaries, right?

Like what does like midnight and time like?

I think it's tricky.

I think personally I would stick to
timestamps in general, but what I, maybe

Nikolay: Well,

Michael: the point of

Nikolay: uh, too better in terms, two times
better in terms of performance because, uh,
in terms of storage, because of it's four.

Right.

So you can save some bites if you want, but you
should remember of course about alignment page if
you place date, and then enter eight, no benefits
because uh, four bites will be filled by zeros.

So you should combine with other four bite column.

I don't use it often, honestly, so I agree
with you here, but also there is time and
sometimes we Technically we can use timestamp
as time saying like, you can use timestamp
instead of date, taking some time of day.

But also you will start thinking, okay, if
it's time with time zone, it's another date
can be different depending on time zone.

So it's, it's, it's becoming tricky.

So probably this is where
you probably will try to.

Timestamp without time zone, just to be concrete.

Okay.

It's utc, I can see that as utc.

This is our date.

That's it.

Again, depends, right?

But if you think about time, technically you
can use timestamp, just like choosing some date.

Always like January 1st something, right?

I don't know.

But it's not, uh, it took ugly, right?

So time sometimes is.

Michael: Yeah, so there's.

There's time with time zone as well and time, and
there's really good note in the documentation.

And on that wiki page I mentioned earlier to
say basically I, I think there's no exception,
they couldn't think of a single exception
where you would want time with time zone.

Uh, they go into some details as to why
that's, um, why that makes sense, but
time on its own makes sense in terms of.

a photo of a clock or like a, if, if
you're, if you're a calendar application
or a to-do list application and you want
the meet, like let's say you want to
go walk your dogs at 9:00 AM every day.

You want that to be 9:00 AM summer,
and maybe you want that to be 9:00 AM
in winter, but it, it's like, it's,

Nikolay: Yeah.

Michael: the time of day, not,
uh, not a time in, in with a date.

And so I think I, I can see more uses for that.

, and I probably would store it
as time because it's not, uh,

Nikolay: Yeah, maybe.

Yeah.

Also was mentioning they all, both timestamp,
both of timestamps, and both of times, uh, they
have precision, um, how to say modifier, right?

So you can, in parenthesis you
can say, I, I don't want a micro.

Just put a zero there and that's it.

Also convenient.

I use it sometimes because like it
becomes easier to have predictable output
without just now converted to timestamp.

And zero in parenthesis gives,
gives you, timestamp with seconds,
but without my milliseconds.

Michael: The Wiki advises against it, I think,
because it rounds rather than truncates.

Nikolay: Okay.

Michael: it, it's possible to round to a time

Nikolay: Okay.

Michael: which might not be what you want,
but yeah, it's interesting and they, I think
they recommend truncate instead date trunk as

Nikolay: Okay.

Michael: or time, I

Nikolay: Date trunk will give you Yeah, it'll
give you, you can give, you can take seconds.

Yeah.

Right.

So, yeah.

And everything else will be
filled like with zeros, right?

Right.

Or it just can be, if you want, for example,
to take on the date from like my small
trick, usually if I have timestamps, I,
I want only date, for example, to then
aggregate and have some report I use left.

left timestamp value, comma 10.

It gives you date usually, right?

It's, it's stable because we have four digits for
year and divider and like, and so on and so on.

So left is much faster, but of course
they Trunk Day for example, also works.

. So it depends.

It, it depends.

But left function, I love it.

It's, of course it's to work with text, but
it's, it's much shorter, sometimes, so, yeah.

but interesting that time is a
regular time without time zone.

It's also by, but with time zone, it's 12 bites.

Michael: Yeah, I didn't understand that.

How could it be more?

Nikolay: Surprise.

I don't know why, but surprise.

So that's why I usually try to stick
with time stepss as, as long as I can.

And only then I, I go to less, uh,
for me to less popular, data types.

But interval is, interval is even worse.

It's uh, 16 bytes . So

Michael: But I, yeah.

So why, why is that?

Is it because it's storing two timestamps?

Okay.

Nikolay: our listeners.

Leave a comment if you're watching
a YouTube for, for leave a comment.

Please explain why it's interesting question.

Well, interesting but not like if, again,
if you store it or your care, but you
don't not have good, a lot of good options.

Interval is, Very good.

Um, it's very powerful and, uh,
very flexible, very convenient.

I use it a lot, including for dba practice.

For example, you want to understand how
long your transaction or statement lasts.

Or when last state change happened, and
you want something like h you will have,
you can, you can use minus separator
and you will have interval as output.

By the way.

Question to you.

what would you use if you select
from activity, uh, to see when
transaction, like transaction age for.

How would you use it?

So there is exact, exact start,

x a c T under score start, column.

And there is of course current timestamp.

So most people, not most people, many people
use now function minus, uh, exact start.

But uh, it's quite.

Now function is very good in terms of
performance because it gets timestamped only
once in the beginning of your transaction.

So your, your, your select
is also a transaction, right?

So now is calculated only in the beginning,
but activity, it's not a normal table.

Uh, and, uh, values are constantly shifting.

And we, if it's quite big while we
read it, we, we, it's not stable.

It's not, we don't have snapshot there.

You will have like shifted values.

You probably will end up with, uh,
negative age negative intervals.

So what you should use is less performant
because less performed function because it
will calculate, current timestamp for each row.

It's called clock time.

So clock, times stamp minus exact
uh, start and um, transaction start.

Right?

So, and in this case you won't have negative
values, which is good way, . So yeah, small trick.

And sometimes it's also worth remembering that
now it's like, it's like, boom, and you have
it for all, but maybe it's not what you want.

Okay.

Good.

Uh, anything else we wanted to.

Michael: I only had one more, which is
only tangentially related, but I heard some
good advice a while back that if you are
adding a column that's, uh, boo, that you're
probably are planning to use a bullion.

For consider using

Nikolay: Hmm.

Michael: So I think the, the, the good cases
that I've heard, uh, for this are things like act

Nikolay: Yeah.

Yeah.

Michael: it's basically, sometimes you don't,
you don't necessarily need it for the first set
of features, but it can be incredibly useful

Nikolay: Yeah, I also use it, this approach.

Yeah.

Yeah.

But you should remember storage, uh,
overhead, of course, because bullion, of
course, it's not one bit, it's one bite.

It's already, you already are eight.

Time worse than you could be . Oh, you can
be like most, uh, storage, uh, efficient.

uh, approach would be like, uh, spend
one bite for eight, check boxes, right?

And then, work with bit operations.

But, uh, okay.

If you have bull column, it's one bite.

Of course if you, if you pack, if you have
eight of them, they will occupy eight bites.

It's good if you have boo and then
timestamp or integer eight, you
will have seven bites lost anyway.

But, uh, in this case, of course, uh, switching
to timestamp won't, uh, be, be, won't make
any difference in terms of storage overhead.

Right?

Michael: Yeah, but it was

you're right in terms of storage.

more of like a, I thought, I think it's like
quite a good thinking exercise, when you're
designing a new feature, just to think, would
there, could there be benefits down the line of
storing this in a different, different format?

Um,

Nikolay: When,

Michael: it makes, it's made

Nikolay: when you, when you are designing,
think about if you need, uh, to remember
when check boxes are, you are pressed, right?

Clicked or checked, right.

So yeah, it's, it's depends on the application of.

. Oh, I wanted to mention, uh, these old balls.

You know, old balls.

What, what is old balls?

Michael: Oh yes.

So there's,

Nikolay: Yeah.

Michael: time, the

Nikolay: All right.

0, 0, 0 time utc.

Right.

So, yeah, it's a, it's kind of
like interesting joke maybe, right?

So, but,

Michael: but it's committed, right?

Nikolay: It works.

Of course, you can say,

Michael: Yeah.

Nikolay: oh, by the way.

Uh, how to, how to get, um, interval,
how to get, uh, current time minus one
day, like now minus, I write one day
in, uh, in, um, comas, not in comas in.

Codes, single codes, and then
converted to interval and that's it.

Or say minus interval.

And then, uh, string, literal, one day it
works perfectly or one hour one, like it works.

But, uh, also there is, uh, so wall ball
sits like also in single, uh, quotes.

You converted to time and.

this time, but, uh, like it's, it's kind of a
joke, but, uh, what is more useful is infinity,
minus infinity and, uh, plus infinity, right?

This, uh, is useful.

You can have a timestamp infinity, and, uh, I can
imagine very good use cases when it, it's like
very, very helpful to develop good application.

Michael: that's a really good point.

So I, um, there's.

feel like I've mentioned it too many times
already, but that Wiki article I think also
includes not using or between probably not being
what you want when it comes to, timestamps.

And you probably want less than
one and greater than another.

And that's, I, that's where
I've seen infinity used most.

So you might want to be, selecting all
times from in the future to positive
infinity or from sometime in the past.

But less than, uh, minus

Nikolay: All right.

Right.

Agree.

Good.

So what else, or that's it.

We wanted to, to keep it short.

Michael: all I

Nikolay: Yeah.

Then let's wrap up and uh, call today.

Right.

Michael: Yeah.

Fabulous.

Thanks so much.

Thanks everyone for listening.

Keep the suggestions coming and see you next

Nikolay: Thank you.

Bye bye.