Postgres FM

Nikolay and Michael discuss over-indexing — what we mean by it, the regular issues people discuss about it, as well as a novel one Nikolay has come across and benchmarked recently. 
 
Here are some links to things they mentioned:

~~~

What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!

~~~

Postgres FM is brought to you by:
With special thanks to:

Creators & Guests

Host
Michael Christofides
Founder of pgMustard
Host
Nikolay Samokhvalov
Founder of Postgres AI

What is Postgres FM?

A weekly podcast about all things PostgreSQL

[00:00:00] Nikolay: Hello, this is PostgresFM, and it's Michael. Hi, Michael.

[00:00:05] Michael: Hello, Nikolay.

[00:00:06] Nikolay: Oh, my name is Nikolay, right. So, do you remember the number? Episode number?

[00:00:14] Michael: 68, I believe,

[00:00:15] Nikolay: 68. Oh, yeah, great. And the topic today will be, I chose it, and it will be overindexing. Next time we will talk about underindexing, I promise. Right?

[00:00:26] Michael: Well, yeah, I mean, this is the, this is the age old thing. I, um, you did a tweet series about overindexing, and this is definitely kind of one big category of issue, where Both of us have come across people that have added a lot of indexes to certain tables and that's caused a certain category of issue.

But, I'd actually say that in the majority of cases that I see that people have performance issues, it tends to be under indexing that's the problem for their cases. but yeah, maybe another, a whole other topic for a whole other day.

[00:00:59] Nikolay: Right, well, right, right, right. If you start without indexes, just with primary key maybe, maybe even without it. Right, then you find out that that indexes is That's a good thing. They didn't teach you when you learn SQL standard about indexes because they don't talk about it at all. But then you discover them and so, right, so we have phases here.

Phase one, you discover indexes and understand that lack of indexes. Here's your performance, your database performance. But once you discover indexing, I bet you tend to do what we will discuss today. You create indexes too much. The silliest approach is to index every column.

[00:01:50] Michael: Well, and worse, right, because I often give that example and I forget that it's not even only as bad as that because... We often, if you want to, let's say we're trying to optimise a specific query, often the optimal index is a multi column index to help us get exactly the right conditions and get an index only scan.

And once you introduce multi column indexes, it's not even just an index per column, it's an index per combination of columns in a specific order. So the number of possible indexes, even if we don't... Start to consider duplicates and overlapping ones. It is not even just the number of columns, right? It's way higher.

[00:02:29] Nikolay: In general case, the optimal index is multi column involving some expressions instead of columns, and also partial.

[00:02:38] Michael: Yeah. And yeah, of course.

[00:02:41] Nikolay: Well, okay, so, Nobody says we always need just one index per table. No, that's not so. Of course, we need multiple indexes. But what, I'm trying to... State is that having too many indexes hurts not only write statements, but also reading statements, selects. By the way, my tweet, it's, it's just one tweet, it's not a serious.

Serious is Postgres marathon, uh, which is about everything, not only performance, all things Postgres as well, and just with some practical examples. , but in this case, so many people, when. they pass this phase one lack of indexes, and they are in phase two overindexed. They realize, and it's like, it's discussed a lot, that indexes hurt too many indexes, extra index.

We'll slightly slow down write statements and by the way, I have, um, I made a mistake in my tweet saying that deletes are affected. This is my usual mistake. I think I made it during our podcast sometime as well. Deletes are not affected. Deletes just put... Xmax colon value to tuple header, and they don't change indexes, so deletes are fine.

Affected, read statements are inserts and updates. For updates, there is optimization technique called HAT, heap only tuples update, which sometimes works, sometimes not, depending on couple of important conditions. But my statement is... Selects are also slowed down and we can discuss, like, things step by step.

Let's talk about writes first, right? So again, inserts and updates. Why are they slowed down? Because Postgres Creates new tuple, each insert, each delete will create new tuple, even if you roll back the operation. If you roll back operation, they will be dead. New tuple will be dead, right?

And when new tuple is created, index entry has to be created in all indexes. It's called index write amplification problem in Postgres. One of the reasons Uber moved back to MySQL. Very infamous story, right? And for inserts, there's nothing you can do. Because we do need to put additional pointer to each index.

But for updates, there is hot updates, heap only tuples. Two conditions needed to be met. First is, there should be additional extra space in the same page as old tuple. And second condition, you should not change values. in columns which are participating in index definition, and it can be even WHERE clause if it's a partial index.

So, the basic example here is, if you have. Updated ad column, and by the way, Jeremy Schneider from AWS RDS team mentioned it in the thread on Twitter as well. So if you have updated ad column, many people tend to have it, and then you have an index on it. Of course, when you change data, you change updated ad value, but since it's an index, you never will have hot update for such updates.

It's impossible. So idea is maybe you should not index it. Try to avoid it. But we, we might want it because probably we want like last 25 updated rows or something, right? So what, like, let's discuss this. What's the best approach here? You have updated that column, you update, which maybe by trigger, for example, each update changes, uh, this value.

But what to do? I want fast updates, I want hot, hot updates. So tuple is

[00:06:41] Michael: I mean, this, this is one of the things I love about performance that you just get hit with these trade offs and it's going to depend on your case, right? What's more important to you? So, what are you most limited by? What's your most precious resource in this case? So, I don't think there's a generic piece.

I think people try and give generic advice. I've definitely tried to do it in the past as well. Sometimes it feels like you can get like an 80 20 rule. This one feels like it's... I know it's a bit of a cliche now, but it really does depend, right?

[00:07:12] Nikolay: right, right. Well, but how, like, let's, let's slightly discuss how it depends. First of all, I forgot to mention hot, hot updates. It's when you, like, I explained conditions, but what is hot update? Hot update is, it's when during update Postgres doesn't need to touch all indexes. So it's, because if you have 20 indexes, it's hard to touch them.

It's expensive and slows down update.

[00:07:37] Michael: And this is because the tuple in the heap doesn't move page right?

[00:07:41] Nikolay: the same page, yeah.

[00:07:43] Michael: and we get like a, I think this, I can't remember the wording for it, but it's like a little bit of extra information gets added to that. So it's like a

[00:07:49] Nikolay: additional jump inside page. Right, so like, index keeps the pointer to the same page, same buffer. Basically, like, when we read something, we need to read whole 88 KB block, buffer, page, and inside it, old tuple Now, say there's a new version, it's super fast to just jump to a new version inside it.

[00:08:14] Michael: and the space on the page thing is important as well, because if you want to optimize for heap only type of updates, tables by default have a fill factor of 100%, don't they? But we can reduce that.

[00:08:28] Nikolay: Artificial bloat. Right.

[00:08:30] Michael: yeah, but if we, if we reduce that to something a little bit, giving a little bit of space by default on each page, yeah, sure, first our table takes up more space, but, in future we might be able to, Get that net benefit of our heap only tuple updates.

[00:08:47] Nikolay: Or just, uh, here we just can benefit from slight bloat we, we have. Because like if some space is left after some tuple became dead and then cleaned up by autovacuum, this is how, like, I mean, I, I'm sure. Most times in the world when hot update is executed, it was in tables with fill factor being default 100. It's just working because Autovacuum already cleaned up some old dead tuples. Right. So, okay. And, uh, in this case, we, of course, if we have updated at column We need to decide if we have index on that column, and we use it, we rely on it a lot in our queries, for example, give me 25 last, uh, lastly updated.

Rows. In this case, probably we want to keep this index, but we understand that updating updated at will lead to, to degradation of this, right? But, of course, like, we also need to, to see the full picture here. If, when, when we update updated at column, Change it. When we change other columns, maybe we already lost, hotness feature of our update, right, because, because we anyway changed something which is indexed, right, so, but in some cases I saw, like, and I recommended, let's, let's just, maybe, let's just get rid of this index on updated data.

Our updates become very fast because we have too many indexes on this table, right, we need to update all of them. In this case, we have, for example, we have some payload, Which is, I don't know, like some JSON, and it's not indexed. We change something there, and automatically we change updated add. In this case, if we don't have index on these columns, neither of them have index.

In this case, update is very fast, but how to find... Recently updated rows. I remember I wrote a couple of times, relying on the index, so either on ID, if it's a sequential number, or on created at column, which is like default now, right, current timestamp. Trying to find like, like lossy search, you know, you find more and then if it's not enough, you get next, you search by created at, but you try to find updated at.

It was some for maintenance, some approach. I just wanted... I just, I knew I don't want index on updated, so I tried to find workaround. It's an interesting like engineering but of course it depends on the case you have. But it's worth keeping in mind such updates are quite slow. So, and speaking of inserts, probably you just want to insert in batches if you can, because in this case many tuples will go to the same page and it will be faster compared to single row inserts.

In separate transactions, right? So, in this case, it might be faster.

[00:11:47] Michael: So we've got the write amplification problem. We've got hot updates. Not only hot updates that can help alleviate that, but also, it's another problem of over indexing, right? If we've indexed loads of columns that we don't ultimately need, we prevent those.

So there's both sides of that. Argument. And then, is there anything else you want to talk about on those before we moved on to things

[00:12:09] Nikolay: course, we had an episode about index maintenance. So, the easiest way to improve the performance of write operations, exclusion, deletes, I will never forget anymore. You need just to perform index maintenance regularly. Like, once per few months, you check all... Unused indexes and redundant indexes you have by far and just remove them.

[00:12:34] Michael: yeah, so by unused indexes we're using, um, is it, I've written it down here, is it pgStatUserIndexes, uh, and not, not just on the primary, ideally if you're doing, if you're using readReplicas also on readReplicas,

[00:12:48] Nikolay: If you have multiple setups, check multiple setups, because there are difficult situations. For example, some software you install to users, like, to customers.

[00:13:01] Michael: oh yeah, like a multi tenant thing,

[00:13:03] Nikolay: Not multi tenant, opposite. Multi tenant is... We have one database and many, many...

[00:13:08] Michael: oh sure.

[00:13:09] Nikolay: Or if multi tent is also interesting, if you have multiple databases, probably you need, or multiple tables, the same structure happens sometimes.

You need to check them all to understand the pattern, maybe not of them all, like maybe just like 10 percent to understand the structure. But I'm talking about when you install software, like for example GitLab, you install it, customers install it, and who knows, maybe you checked several cases, index is not used.

But who knows, maybe like 5 percent of your customers need this index. And also time matters. You need to check long duration of usage. We mentioned example when some indexes is needed a lot, but on the first... Day of each month. If you, if you happened to check, uh, usage, uh, you reset statistics in the, in the middle of month and checked in the end of month, you might not see the pattern, uh, well enough.

So, checking usage, uh, requires some understanding as well and, uh, should be done carefully. Redundant exists also, like, some people have fears. That if some redundant in this looks like redundant, and we know like index on column A is redundant if it is an index on column A and B, so our A is on the first place, so we consider this index as redundant.

But what if it's used? It can be used because it's smaller and the planner thinks... Cost is lower, it's

[00:14:44] Michael: Oh, yeah, it, it will likely be used because it is smaller, right?

[00:14:48] Nikolay: right,

[00:14:48] Michael: but it doesn't, yeah, you're right, um, good point. So Redundant can even be, well, so Redundant can be these overlapping ones like you described, but it can also be, literal duplicates like Postgres

[00:15:01] Nikolay: oh yeah, duplicate it.

[00:15:03] Michael: yeah. so Postgres doesn't stop you.

I think that surprises people. It doesn't stop you creating the exact same index definition as many times as you want.

[00:15:13] Nikolay: Why not? So, yeah, and, and, uh, just try to achieve the minimal set of indexes you really need. This is the general advice here. But let's move, move on to more interesting and... Topics which are not usually discussed.

[00:15:29] Michael: I had like a, I had like one, at least one more kind of easy one that we probably don't need to discuss much, but, I think people often forget that they're competing for space in the cache as well. Like if you're in a position where you don't have, or your working set doesn't easily fit in memory, the more indexes you're creating, the more competition there is for that limited resource.

And then the final point was you mentioned index maintenance. but it's also a point if you need to rebuild everything, if you, like, for example, there was the, the issue in 14 that was fixed, I think in 14. 4, um, we needed to rebuild all our indexes. The more indexes you have, the longer that takes.

Like there's, there's a few things where the more you have, the more maintenance there is to do. Um,

[00:16:18] Nikolay: exactly. Uh, indexes, uh, health degrades over time. And you're talking about, uh, the case when like we, we rebuild index We've concurrently option, either creating this concurrently or re index concurrently. And during this XminHorizon is being held and Autovacuum cannot delete freshly that tuples.

And optimization was attempted in 14. 0, but all versions between 14. 0 and 14. 3, they have bug in this feature. And so fixed in 14. 4, yes. And Yeah, so you're right, we didn't mention basic overhead from extra indexes, it's disk space and also buffer pool usage

[00:17:05] Michael: good point on this space.

[00:17:06] Nikolay: and actually space in the page cache as well, so in memory.

So the more indexes you have, the more memory you need to maintain, Postgres needs to maintain them during operations, even if you don't use them. They need to be updated, so they need to be loaded to the buffer pool to be updated, and it's not good. But let's, let's talk about even more, like, interesting, like, I, I mean, like, what we, what, uh, this tweet discovers and talks about, probably it's not super big problem, but, it's a surprise to many.

Like, we have a small table, or bigger table, we use pgBench accounts, and, I mean, regular pgBench with scale like 100, it means like, I don't remember, 10 million rows or so, like small, small table. And , then we just have, by default, we have only one index, which is primary key on this table.

And we just check the performance of primary key lookup. It's simple. Select only workload pgBench provides by default with option dash capital hyphen capital S. Selects only. , and then we just check performance, check planning time, check execution time, and then we add one more index, right? And we see that, uh.

Performance of planning time degrades, and it degrades noticeably, like first degradation is like more than 10%, 16 percent maybe or so, like if you check, even planning time only, right? Execution time doesn't degrade, but planning time degrades. And then one more index also like 9 percent or so, and then more and more, and then this like, the speed of degradation goes down a little bit, like to 5%, but each index a penalty to planning time 5 percent or so. It's significant. I mean, like, it doubles very quickly by, by, I don't know, like, by index number

[00:19:08] Michael: nine, I'm looking at

[00:19:09] Nikolay: ten, or something. Right. So, having ten indexes compared to one index is like, , twice longer planning time. By the end, we reached 20, 20 indexes, and the degradation was more than 4x, so more than 300 percent was added to planning time.

But there was additional effect at some point, we will talk about it slightly later, it's even more interesting. But this is like super basic, Index means more planning time to basic queries like primary key lookup. And why is it so? It's because the planner needs to analyze more options for more plan options, right?

Obviously. And,

[00:19:53] Michael: Is this, just to check, are we adding the equivalent duplicate indexes on the primary key column? So it basically has a choice of all 16 or all 20 of the indexes, so it has to choose which one is likely to be the lowest cost.

[00:20:08] Nikolay: right, right, right. So, yeah. This exactly shows also how bad it is. It's bad to have too many indexes, especially duplicate indexes or unused indexes, even for selects. But it affects only the planning time. And, of course, if... You can get rid of planning time for most of your queries if you use prepared statements.

And it's easy to check with the same P bench saying capital M prepared. And you'll see there is no effect. Like you can, as many as you have. And planning time is very small because there is no planning time for most queries. It's planned only once and then it's reused the same plan. Right? So, it's worth saying that this Monday, this week, PgBench, finally, like, I waited, like, honestly waited, PgBouncer, sorry, not, PgBench is for benchmarks, yeah, PgBouncer, which is still the most popular, I think, pooler,

[00:21:06] Michael: Yeah.

[00:21:07] Nikolay: right, although there is competition growing, last years, finally PgBouncer got support of Prepare statements for transaction pool mode.

Transaction pool mode is the most interesting because it like has most benefits. We discussed poolers as well in our... Another episode, and, uh, people who were using PgBouncer, a lot of people, could not use prepared statements, and now they can, since this Monday, and this is a big, big feature, and I'm glad that it happened, finally, so, I mean, I wanted to say, maybe not 15 years, but close to it, I waited for it.

Skype released PgBouncer maybe in 2007, 8 or something, like almost 15 years, definitely. So, I didn't want it, like, too much, but I obviously observed a lot of projects who would benefit from having this feature. So, it's great that it was released, definitely. And,

[00:22:11] Michael: How are you going to celebrate?

[00:22:13] Nikolay: benchmarking.

Benchmark

[00:22:17] Michael: Seems appropriate.

[00:22:20] Nikolay: party, you know. So, yeah, I wanted to thank all people who were involved. I don't remember all names, but I want to mention a couple of them. Geltef and Nemanil, sorry if I'm pronouncing it wrong, from Microsoft team, and also Konstantin Knizhnik, who created the latest version of the patch this spring, I think, from the Neon team.

And a couple of more folks who prototyped in this direction. And I forgot names, sorry, but obviously it's, multiple teams, multiple people and teams actually were involved and that's great to see that it's released. I, knowing the history of PgBouncer development, I actually expected that this patch won't be merged quite soon.

So that's why I said it's... It's super fast. It's not fast. Gerald mentioned that it was like 1. 5, like one and a half years of work. But from perspective of 15 years of waiting. So anyway, Each index adds penalty to planning time. It's probably very low, sub millisecond for small cases, but it might be already, like, millisecond, depends.

For primary key lookups, which are very frequent, for example, thousands per second, it's a lot of CPU time. And now, if you use PgBouncer before this Monday, you could not use it, now you just... Need to switch to prepared statements and enjoy that CPU goes down at least a little bit, right, and primary key lookups will be much faster now.

Give me your skeptical opinion about optimization on sub millisecond timing. It

[00:24:05] Michael: Yo, uh, well, firstly, I commend anybody that actually looks into this stuff and benchmarks it and shares the data. I think that's a step that people often miss when they're talking about generic advice around this. and I mean, I saw the numbers shared around, I think it's dangerous, isn't it? Like you share data and then people quote it without looking into it themselves.

So, even the pgBouncer stuff, they, they ran some synthetic workloads and showed it was between 15 percent 1. 5 to 250 percent faster on depending very much on the synthetic workload used. And I've seen some people then quote that saying it's minimum 15%. Improvement to all workloads. It's like, that's not what it means.

[00:24:49] Nikolay: Right.

[00:24:49] Michael: so I think it's like, it is a bit, dangerous to say that without checking for yourself. Like, check, do testing and do, do your own research. But, so that's kind of where I want to go with this degradation of query planning time. The graph that you've done is great. But the y axis is only, like, even at the 400 percent degraded query planning time, we're talking about a total of 0.

1 milliseconds on average of planning time.

[00:25:17] Nikolay: First of all, it's another, it's another mistake in that tweet. I'm fixing it in the final repository of these how tos. These problems will be solved. I hate when people, in this case, it was not people, it was Charlie Petty who wrote this graph with this advanced data analysis. It was a bunch of Python.

If you say don't do it, it won't do it anymore. So you shouldn't start y axis not from zero, right? It should be always from zero.

[00:25:46] Michael: I agree generally, but I wasn't even talking about that. I was talking about the other end of the y axis. Like, what's the maximum average planning time, even at 20

[00:25:56] Nikolay: Let me, let me explain from this point of view. For most, uh, quite heavy queries, the planning time is, like, the smallest fraction of everything, right? So, in optimization, even if you go to zero planning time, Probably won't be very noticeable, but that's why I keep talking, I keep talking about primary key lookups or other index scans or index only scans, when, uh, we have execution time is very small and planning time compared to execution time, it becomes noticeable, sometimes exceeding, and in our, like, if you have a table, and create 20 indexes, in our case, we showed that planning time is already four times more than execution time.

It's 80%. And if you remove it from the picture, it's 5x improvement. For, of course, we talk, still talk about, like, sub millisecond queries, but if you have 10, 000 of them per second... It's significant, right?

[00:27:00] Michael: Yeah. Oh, I completely agree. And I'm a big fan of this. Even in your example, you show that even with only one index, planning time is about double execution time. So we're talking about significant planning time here, regardless of how many indexes there are, if you're looking at in relative terms. I'm just saying for people that hear these numbers, check for yourself on your own data as well, because, um, We're talking about 0.

1, like you say sub millisecond, but it's a, it's not even that, it's sub a tenth of a millisecond. It's like another order of magnitude

[00:27:32] Nikolay: yeah, yeah, of course. Because in our case, this case was a very small table, again, 10 million rows, it's nothing. 10 million rows, it's like, these days, it's small.

[00:27:43] Michael: so on to more like specific things, I, I would be interested, I don't see many people, like I do see some duplicate indexes, but what's the, what's the same effect on, of having 20 indexes when they're not all on the primary key? Like my understanding is that this was, it's affected planning time because the planner

[00:27:59] Nikolay: It's a good question, we should check.

[00:28:02] Michael: But that, so again, like, most people don't have that kind of, I've got 20 indexes on the table, they don't have them all on the

[00:28:07] Nikolay: Good question, we will double check this path as well. Like, indexing different columns, which, like, for a planner it should be obvious that checking those indexes doesn't matter.

[00:28:18] Michael: Yeah.

[00:28:18] Nikolay: Doesn't make sense, I mean, doesn't make sense. But who knows? If... Planner has this optimization, let's check. I wanted to say, there's another test Vitaly Kukharik created yesterday, actually, checking this new feature in PagerBouncer, and CPU load was like, just, again, moving from 1 index to 20 index, And we've prepared statements, fruit page bounce, everything.

CPU load went from 80 percent to 40%, so two times less. Just switching to prepared fruit page bounce. So this is like, but again, it's a very specific case. Primary key lookups, indexes, extra indexes on the same column. Real case can be very, very different from it.

[00:29:04] Michael: And then, one more thing that you don't, in fact we should discuss here that where the graph changes shape a little bit. Um, but before we do, another, like, thing I've realized we haven't talked about is a case that does hit people pretty hard, which is too many indexes plus partitioning with, uh, dodgy

[00:29:24] Nikolay: let's switch to final topic, which is like, should be like a cherry on the top of this cake.

[00:29:31] Michael: Yeah.

[00:29:31] Nikolay: Uh, when you reach, like, for this particular case, simple, query, one table, primary key lookup, you just add extra indexes. No other tables are involved. The thing is that when you reach, uh, index 15 and add index number 16, uh, degradation It's much worse, but only if you have multiple session experiment, I mean, if, if like it's, if just one session, you won't notice it probably, but in concurrent environment, when many sessions around this primary key lookup, they, at...

When you add index number 16 and number of tables and indexes all together exceeds 16, this threshold hard coded in source code. In this case, you start observing additional degradation from, log manager behavior. Because, and it can be observed as a weight event lightweight log, weight event type is lightweight lw log, and the weight event itself is called log manager.

So, log manager starts... , doing some additional work because it, the thing is that when you, just run some select this select is going to acquire a log on, on the table and all its indexes, even if they don't participate. In the plan. You don't need them, but still Postgres needs to acquire access share log to block possible DDL, basically, right?

And it will be released only at the end of transaction. So, to acquire a log, log manager needs to deal with slots, basically. And when it's doing it, it needs this lightweight log. So your transactions, select transactions, start competing for this log acquisition. I mean, they're not blocking each other, but acquisition, like, takes some time additional.

Why it's 16? Like, before that, there is a thing called FastPath. And there is another algorithm, which is not, like, not full fledged checking. It's faster. But once you need more than 16, access share log, this is the lowest heavy log, access

[00:31:54] Michael: And is this, like, a hard coded number, or what, where's this 16 coming from?

[00:31:58] Nikolay: It's, uh, it's a hard coded number, it's called FP log slots per backend, FastPath. Log slots per backend is 16, quite old thing, and, uh, and that's it. So, even if you have, like, 100 or more vCPUs, terabyte of memory, this hard coded number can hit you badly. And, one, why I switch to this topic? Because partitioning can be involved here.

Because partitioning is multiplier of problem. If you, for example, have a simple SELECT to SUM table... And you have 15 indexes. Relation, I mean table. Relation means, in Postgres terminology, relation is both tables and indexes. It's called relation. Okay, so table plus 16 relations. All logs will be fastpath.

By the way, a comment. it might look surprising that we need to log all... Indexes, even if they don't need them. It's so. But it's so, yeah. But I'm curious if it would be implemented differently, like virtual locking for indexes. If a table, like, indexes do not live without tables. If table is locked, we could consider indexes also locked.

Like, it's additional checking some pieces of code, but maybe checking this is less than two. Problem then, dealing with this FastPath versus not FastPath.

[00:33:28] Michael: Good question.

[00:33:29] Nikolay: Oh, by the way, FastPath can be seen by hands, you just select from pgLogs, and there is a column called FastPath. But you, when you checking this, you need to do it in another session, knowing PID.

Process ID of your main session. You just don't close transaction, you can, if you just say, begin, select something, and keep transaction open, and then from different session you look at pgLogs, knowing the process ID of the first session, and you will see, all the logs you will see, relational level logs.

Heavy locks, and you'll see which of them, FastPath which are not, and FastPath 16 maximum. So this, this is hardcoded, and uh, there is a discussion on Hacker News, I think there are several discussions, the latest one I started, and originally like, it was maybe no, now Thomas Wondra proposed some patches, we are going to explore them.

So to increase this threshold to 64, for example. Or, there are complexities,

[00:34:34] Michael: did you say Hacker News, or do you mean Hacker's mailing list?

[00:34:37] Nikolay: sorry, yes, hackers, pgSQL,

[00:34:42] Michael: I was thinking

[00:34:43] Nikolay: no, no, no, no, no, no, such things are not discussed on

Hacker News,

yeah, sorry, pgBench versus pgBouncer

[00:34:49] Michael: That makes way more sense. Let's link them up. I'll do that.

[00:34:53] Nikolay: and Hacker News versus pgSQL hackers, of course, yeah, sounds similar, sorry for mess, so there are proposed patches. to raise this threshold and turned out it's not so simple as I thought originally. Like one line, like you just change to 64. No, no, no. There is a big discussion of complexities of this move.

And, uh, I think it will be great to, to move this forward and, uh, I plan to test it, but back to like how, it feels. I see several cases when it was very bad situation, when, uh, this. Uh, lightweight log wait event came and many sessions are starting to clashing on this, right? And we quickly reach a number of active sessions, like reaching Number of CPUs we have, vCPUs we have, cores we have, and exceeding it, and it puts server on knees.

I mean, we have brief outage, like, lasting from several seconds to maybe a minute. It's sometimes, it's very, it's a bad problem, actually. And, uh, it's also a hard problem to understand. But the main, uh, how, how it looks like. You see this lightweight, lightweight lock. Log manager in wait events. So, unfortunately, there is nothing you can do here, except reducing number of relations.

For example, removing some indexes, or reducing frequency. It happens when frequency is high. If frequency is low, you won't notice. You can have hundred indexes, and they will not click. There is no contention observed, but if you have many indexes and, for example, you partitioned and some queries don't have partition pruning involved in plans, meaning that they need to deal with multiple partitions maybe, in the worst case, all of them, and each partition has, for example, 15 indexes. Obviously, you have a lot of non fast path locking involved, and yeah, and it slows down lock manager behavior, and you see this spike of lightweight, lock manager lock contention.

[00:37:10] Michael: Yeah. Well, and it can be even like, even fewer indexes can be a problem. As soon as you've got, let's say like 10 partitions and you have, you've

[00:37:20] Nikolay: indexes, ten partitions, right?

[00:37:22] Michael: or even like two indexes. Yeah. It becomes an issue really quickly. And I think this is one of those. Like, confounding factors where it's not even an issue of too many indexes, it's, it, the main issue is you've forgotten a couple of queries that don't have the partition key in the, like, a couple of frequent queries that don't have the partition key

[00:37:42] Nikolay: Ah, they should be frequent. If they're not frequent, it's okay.

[00:37:45] Michael: exactly, and it's that, and then that hits this issue, and then that kind of, like, uh, cascades into other issues, so, yeah, a really good point about this, I think this is

[00:37:56] Nikolay: Right. So, so, uh, yeah, so partitioning, but in general, when you, when partition pruning in plants is... Involved, at least in Postgres 14 and higher, if you deal with, if plan is working only with single partition, you won't see logs for other partitions. I just double check manually for safety, you know. It's so.

I checked it a couple of days ago additionally. So, it means that you can have 10 indexes and work with... By the way, if you join with other tables... Again, more indexes in the book. Like, joins also multiply this problem, right? So, if you, like, I'm talking about high frequency queries, like thousands of per second, and the lack of, CPU power we have here, because if you talk about 500 per second, and you have 128 vCPUs, or 244, like, modern AMD EPYC, for example, generation 3, right?

In this case, Probably you won't notice this problem, but if you start noticing, and it can be noticed, early stage, you just see wait events popping up, you don't have outage yet, but already these wait events, you should be very careful, consider index optimization, reduce indexes, check the plans, try to reduce frequency, and what else?

Ah, if it's primary key lookups, back to our situation with... Planning time. If you see planning time is significant here, it's time to consider, again, if you have PgBouncer, time to consider prepare statements and reduce the duration these logs are being held, because they are released only at the end of transaction.

So, right. So, yeah, this is probably, by the way, one of the biggest performance problems Postgres has right now, which is hard coded, probably it will be improved, and, of course, only projects with high frequency Queries experience it, but it's, it's an interesting problem. I consider it as similar as sub transaction issues I, I studied like a few years ago.

So, yeah. Oh, by the way, also, , worth noting that, uh, the RDS documentation has very good Weight event, pages, like, for, for, like, most popular weight events, they have a whole page. It's much better than official documentation, because it talks also about use cases, what to do, how to diagnose, and so on. And for this particular lightweight log, log manager, they have very good page as well. So, worth checking.

[00:40:42] Michael: I'll put it in the, in the show notes.

[00:40:44] Nikolay: Good. So... That's it, I guess, right? So, like, don't allow over indexing. Under indexing, it's next time.

[00:40:55] Michael: Yeah. Probably most of you listening are still in the, in the under, I would guess most of you are in the under index stage, but the ones that are in the over index, I'm sure this would be,

[00:41:05] Nikolay: should have a vote somehow, like, do you feel over indexed or under indexed? I can do it on my Twitter and

[00:41:11] Michael: Yeah. Great.

[00:41:13] Nikolay: Okay, good. Thank you, all listeners.

[00:41:17] Michael: Yeah. Thanks so much, Nikolay.

See you next week.

[00:41:21] Nikolay: usual, if you reach this point, it's definitely, like, we ask you to share it with your Colleagues and friends who use Postgres and also, we also ask you to provide us some feedback and... Ideas for future topics. Maybe there are not many topics left. What's your feeling here? We covered everything already, right?

[00:41:45] Michael: Yeah. I'm finding the more topics we do, the more research I have to do for each one. We get into the ones where neither of us, either one of us knows it. Like, well actually this is a bad example because you've done part of this research as part of your work anyway. And yeah, I think we're getting having to dive deeper each for each new topic now or we're having to cover ones that we're less familiar with. So yeah, but I don't think we're anywhere near running out.

[00:42:10] Nikolay: Best Postgres library. Best Go library for working with Postgres. There are interesting nuances there.

[00:42:20] Michael: Honestly, and we've only like, if you look at the index types, we've talked a lot about bt indexes mostly today. I think we did one on print indexes, but we've never touched. I don't think

[00:42:29] Nikolay: Gene fast update. Yeah.

[00:42:31] Michael: I don't think so. Or I think we mentioned it, but not, yeah. Hash we haven't talked

[00:42:35] Nikolay: Okay. We have some fuel. Don't, don't write us. Okay. I will. Okay.

[00:42:43] Michael: take

care everyone. bye.