Oxide and Friends

Adam and the Oxide Friends follow Bryan on Mr. Nagle's Wild Ride as he investigates performance anomalies. Bryan used all manner of tools, from gnuplot to the DTrace-inspired bpftrace! If you have ever cared, or plan to ever care, about the latency of network-borne protocols, you won't want to miss this!

We've been hosting a live show weekly on Mondays at 5p for about an hour, and recording them all; here is the recording from October 2nd, 2023.

In addition to Bryan Cantrill and Adam Leventhal, speakers included Tom Lyon, James Tucker, Eliza Weisman, and Dan Ports.

Some of the topics we hit on, in the order that we hit them:

If we got something wrong or missed something, please file a PR! Our next show will likely be on Monday at 5p Pacific Time on our Discord server; stay tuned to our Mastodon feeds for details, or subscribe to this calendar. We'd love to have you join us, as we always love to hear from new speakers!

Creators & Guests

Host
Adam Leventhal
Host
Bryan Cantrill

What is Oxide and Friends?

Oxide hosts a weekly Discord show where we discuss a wide range of topics: computer history, startups, Oxide hardware bringup, and other topics du jour. These are the recordings in podcast form.
Join us live (usually Mondays at 5pm PT) https://discord.gg/gcQxNHAKCB
Subscribe to our calendar: https://sesh.fyi/api/calendar/v2/iMdFbuFRupMwuTiwvXswNU.ics

Speaker 1:

Hello, Bryan.

Speaker 2:

Hello, Adam.

Speaker 1:

How you doing? Doing well. How are

Speaker 2:

you? Sorry.

Speaker 1:

Good. Couple of weeks off. Just remembering how to do all this stuff.

Speaker 2:

Yeah. Yeah. We had a bit of a break in there. Sorry, sorry folks who have become regulars. We had our all-Oxide meetup, which I thought was a great time.

Speaker 1:

That was great, man. It was great. And in particular, there are a couple of colleagues who I'd only seen directly, head-on, you know, boxed in a window, and I got to meet two of them. I had a hard time recognizing them.

Speaker 2:

Recognize them because you had just never seen, like, the back of

Speaker 1:

their head, or seen them in person. Yeah. I've never seen them five degrees off of center. Well, and it's also like you can't go up to someone

Speaker 2:

to say, you know, I found your head shape to be surprising. I mean, your arm or whatever it is. Or I like, you are you are of much more average stature than I thought, or you are I mean, there's basically no way to enter into that without being

Speaker 1:

inappropriate. Weird. Agreed.

Speaker 2:

Very weird. Not necessarily inappropriate. More like weird. Weird. Yeah.

Speaker 2:

And I think it like, once you go into that, there's no way out of it too. It's like, why are you even here? Just, like, just from afar. You know?

Speaker 1:

I think that's right.

Speaker 2:

Of course, now at Oxide it's like, wait a minute. Who are they talking about? Clearly,

Speaker 1:

who is it? Like, is it my head shape? Am I weirdly tall or short?

Speaker 2:

I think we need to emphasize that it's it's everyone and no one. There you go. Sure. I mean I

Speaker 1:

mean, you can say that, but there are a couple I'm thinking of. Now, I would say that we do work with remarkably tall people. In some cases, alarmingly tall.

Speaker 2:

In some cases. We do work with remarkably tall people, but not all tall. Yeah. And it's, you know, it's all about them. I also feel — yeah, am I the only one doing the following?

Speaker 2:

This is — I'm about to tell on myself here. Maybe I am the only one doing this, and I shouldn't say this. I feel that a video-conferencing-ism that I have, that I'm not able to get myself to stop doing, is the overly emphatic thumbs-up next to my cheek in a conversation.

Speaker 1:

You know what I mean? I do. And you you do that you do that in person now?

Speaker 2:

I for sure have done that in person. And I've done that in person not only in that context, but when you're talking to a colleague that you're used to seeing over, you know, a video conference, it's like that thumb's just ready to go. That emphatic, cheerful thumbs-up next to my cheekbone, so you can see it on video and know that I'm agreeing with you. Oh, for the video. Gotcha.

Speaker 1:

That that's better than, like, feeling the need to reach for a keyboard to have some sidebar conversation while other people are talking.

Speaker 2:

You know, that's another thing that's kind of interesting. Yeah. About, like, the, the the sidebar conversation ends up just becoming an important part of a meeting.

Speaker 1:

Yeah.

Speaker 2:

Totally. But that's not why

Speaker 1:

But that's not why we're here. Exactly. As excited as we are.

Speaker 2:

As excited as we are. Yeah. I had this tweet today about — man. Yeah.

Speaker 1:

So how did we get here? How did you want—

Speaker 2:

Where does this odyssey start? Yeah. So this odyssey starts — well, I would like to say, as I was kinda reflecting back on debugging this problem: this is not the first time that this problem has inflicted itself upon me. Which is to say, the lack of TCP_NODELAY being set on a connection, resulting in the very unfortunate combination of delayed ACKs and Nagle's algorithm. We'll kind of get into all that here.

Speaker 2:

I think it might be the first time that I myself have debugged it all the way to root cause. I feel like somewhere along there, in other cases, I have either not been able to debug it to root cause, or someone else has said, like, wait a minute, have we checked this thing over here? And as a result, this one seared itself into my brain in a way that those prior ones did not.

Speaker 1:

I mean, I can imagine that the data for a lot of these kinds of pathologies looks similar. Right? Like, if once you've seen it once, you start to look for it in the data and you see these hiccups. You're like, wait a minute.

Speaker 2:

Yeah. You know, it I feel that, like, that that's a good question because I feel like the data for this one looks slightly different. I mean, when I have seen this thing in other presentations, it's actually even harder. And this was not easy to debug at all.

Speaker 1:

This is really

Speaker 2:

really a mess. So, yeah, should we talk about, like, kinda how we got here?

Speaker 1:

Yeah. Yeah. So what what were you debugging?

Speaker 2:

So, I was debugging — we had some IO performance issues, some latency outliers that were bad. Mhmm. And we were trying to understand why. We'd seen this both on customer sites, and then, fortunately, we had seen a very similar symptomatic presentation kind of here, at Oxide. We've got our own rack that we run our own workloads on. So I'd kinda seen it in both places.

Speaker 2:

But, you know, it always makes me nervous, because you're kind of reproducing symptoms, and you and I both know that reproducing symptoms is not necessarily the same as reproducing the underlying problem. And so one of the things that I did go to earlier in this problem, and that I don't regret, is something that I've learned the hard way over the years, which is: when looking at IO pathologies in particular, it is really important to visualize every single IO. Do you know what I mean?

Speaker 1:

Yeah. Totally. You know, as opposed to aggregating them into buckets or looking at one-minute or 30-second samples or whatever?

Speaker 2:

Yes. So that you can see patterns. And we saw this even back in the day in Fishworks. We saw some really wild patterns when — we had this analytics facility that we had added to the NAS box that we were building, a DTrace-based analytics facility, which was really interesting. And one of the things that we did there that was, I think, pretty novel at the time was the ability to get heat maps for IO latency. And then we would look at those over time, and there were some, like, wild patterns in there.

Speaker 2:

Do you remember we gotta dig up, like, Brendan's blog. Yeah. I was particularly I

Speaker 1:

was thinking oh, I was thinking of the pterodactyl.

Speaker 2:

The pterodactyl? Yeah. And it's — here, I've got the link. I can drop it in the Discord.

Speaker 2:

This is "latency art: X marks the spot," and, I mean, this was, I feel, just wild. Let's see. One of my alts here is gonna drop it into the Discord. So, hopefully, that—

Speaker 1:

Yeah. Yeah. I've got the the icy lake there.

Speaker 2:

My Discord is going — oh, there we are. Good. So, you look at it just by looking at latency of operation over time. So you're putting basically time on the x axis and latency, in some way, shape, or form, on the y axis. And in this case, this is actually a heat map, showing you kind of the frequency at different latencies, which is easier — I mean, you can't look at thousands of individual IOs that are happening in a second.

Speaker 2:

You've gotta be able to aggregate them up a little bit and kind of quantize them a bit. And we would just see, like, absolutely crazy things. And I think he's got a link there — yeah, that first link, the latency art one, is to that thing that we called the rainbow pterodactyl.
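A rough sketch of the quantization being described — binning per-IO latencies into a time-by-latency grid, the way a latency heat map does. This is illustrative Rust, not the Fishworks implementation; all of the names here are made up.

    use std::collections::HashMap;

    /// One traced IO: completion time and latency, both in microseconds.
    /// (Hypothetical input; in practice these would come from a trace.)
    struct Io {
        t_us: u64,
        latency_us: u64,
    }

    /// Quantize IOs into (time bucket, latency bucket) cells, so that
    /// thousands of IOs per second become a drawable heat map.
    fn heatmap(ios: &[Io], t_bucket_us: u64, lat_bucket_us: u64) -> HashMap<(u64, u64), u32> {
        let mut cells: HashMap<(u64, u64), u32> = HashMap::new();
        for io in ios {
            let cell = (io.t_us / t_bucket_us, io.latency_us / lat_bucket_us);
            *cells.entry(cell).or_insert(0) += 1;
        }
        cells
    }

    fn main() {
        let ios = vec![
            Io { t_us: 1_000, latency_us: 450 },
            Io { t_us: 1_200, latency_us: 100_000 }, // a 100ms outlier
            Io { t_us: 2_500, latency_us: 500 },
        ];
        // 1-second columns on the x axis, 1ms latency rows on the y axis.
        for ((t, lat), count) in heatmap(&ios, 1_000_000, 1_000) {
            println!("t-bucket {t}, latency-bucket {lat}: {count} IOs");
        }
    }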

Speaker 1:

Yeah. And you would just see

Speaker 2:

these, like, super subtle effects by looking at latency over time. And, I mean, as you know, Adam, I either control time and space in my mind, or, more likely, the gods like to taunt me. There was, like, an extended period of time where whenever I would give a presentation, the gods would pick the most ironic problem to send my way immediately after that presentation. So — okay — in particular, it's like, oh, did you just give a presentation on debugging in production?

Speaker 2:

Let us now send you a nightmare to debug in production. Let us now fill ourselves with mirth by sending you — oh, I'm sorry, did you just give a presentation on pathologies in firmware? Let us now send you a new pathology in firmware that delights us. So in particular, I gave this presentation called Zebras All the Way Down, on latency outliers deep in the firmware stack in particular, and then got absolutely dunked on, like, the next moment, because the gods, like, saw my presentation and immediately sent my way this really gnarly IO performance issue that we only saw in one data center. And it took me too long to actually just go visualize every IO.

Speaker 2:

And when I did visualize every IO, what I saw were really clear bands. In this particular case, I would see this very clear band of, effectively, IOs all being delayed until they were all clearly released from the firmware — there's clearly a 2,700 millisecond timeout somewhere in that firmware. And this is when I learned that Toshiba made hard drives. I actually didn't know that Toshiba made hard

Speaker 3:

drives. Wait.

Speaker 2:

No. Oh my god. This is where I get to the point where I'm like, is this, like, a drive firmware issue? I mean, I had kind of been debugging this assuming that it was a Postgres issue or a ZFS issue, because those two were definitely both on the scene

Speaker 1:

and just, like, going at it like a like, wet cats in

Speaker 2:

a bag. So it's, like, reasonable to assume it was one of those two. But then as I was looking at the actual IO data, seeing these off-the-charts outliers, I'm like, maybe this is an actual, like, drive firmware issue. And I'm like, do we have, like, a different version of HGST firmware in this data center than we do in other data centers? And I pulled the drive information.

Speaker 2:

I'm like, Toshiba makes hard drives?

Speaker 1:

There was

Speaker 2:

just this moment of like, what?

Speaker 1:

There you go.

Speaker 2:

Well, and it's actually kind of part of, like, the Joker Oxide origin story a little bit — I guess that makes Oxide the supervillain — but this is when I realized that Dell would just routinely stuff Toshiba drives in. Like, you can't spec your drive with Dell. And Toshiba — these are two-and-a-half-inch drives at the time — basically, we learned that Toshiba's supply was effectively exclusively bought by Dell, and they're shoved into unsuspecting customers' machines. Anyway, we discovered a lot of people had our issues.

Speaker 2:

But I one of the things in that whole thing that I regretted is, like, I really should have visualized each IO earlier, which I thought we had learned at Fishworks, and I just kind of, like, forgotten.

Speaker 1:

You're right. That's a great point. And I think we learned it even, like, with DTrace in the early days, just in the TTY, just in the terminal, where effects that are hard to express numerically, or at least hard for me to express numerically, become really clear just to the eye — where you see patterns that are hard to infer computationally.

Speaker 2:

Totally. Totally. Especially if you've got kind of, like, light banding, you know — you can see that visually really quickly, but it can be much harder to quantify. Yeah.

Speaker 2:

Visualizing data is a big one. So I wanted to go do this. And in particular, in this case, it's, like, a little bit challenging, because, you know, in the Oxide rack, what is actually presented to the user is a VM. You've got virtual storage, virtual network. And I really wanted to understand it from the guest's perspective — like, let's start there.

Speaker 2:

What is the guest seeing in terms of latency outliers? And this is where I did actually end up using bpftrace for the first time in earnest.

Speaker 1:

Couldn't believe it. I couldn't believe it when you when you told me you're going there. Very impressed.

Speaker 2:

Well, I mean, be I did no. Because, like, I needed it. You're right. I mean, it's

Speaker 3:

like Sure.

Speaker 1:

Absolutely. You

Speaker 2:

know, it's an Ubuntu, you know, 22.04 guest, and, like — and, actually, you

Speaker 1:

know, this is one of

Speaker 2:

those where it's, like, I actually don't wanna be vindicated right now. I actually want bpftrace to be, like, unequivocally awesome. I do not need to be right about anything right now, and I would just love it to be delightful.

Speaker 1:

Should we explain what bpftrace is? I realize that, you know, friends of the show may be familiar.

Speaker 2:

Sure.

Speaker 1:

But maybe not. So, we talked at length, and then at more length, about DTrace a couple weeks ago.

Speaker 2:

And yet, not enough. We—

Speaker 1:

And yet it turns out there was still more — like, we didn't even talk about data visualization or Fishworks analytics or whatever, I'm realizing.

Speaker 2:

Did you have, like, that moment after that episode where you're like, oh, man, there's so much stuff we didn't talk about? Like, how can you even say that about an episode that's two hours long, you jerk.

Speaker 1:

No. My family was like, oh, you're still here? So, no, I didn't have time to think about what I had forgotten to mention. But so, eBPF — the extended Berkeley Packet Filter — is this, you know, facility in Linux that has been extended even farther and does dynamic tracing in Linux. Is that an overly friendly way of saying it?

Speaker 1:

I don't know. I mean, the one in particular,

Speaker 2:

it has this facility called BCC. Yeah. There were people that would complain about the usability of DTrace, and those people, I sentence them, for their crimes: they must go use BCC. The complaints about the usability of DTrace kinda went away after BCC.

Speaker 2:

Yeah.

Speaker 1:

And I will also say that, like, eBPF and BCC and these other things actually can do lots, lots more. You know, in particular, DTrace intentionally puts a lot of constraints on what you can do using DTrace, in terms of the safety of the system, mutability of the system. And that, in some cases, is layered on after the fact in eBPF, and sometimes not. So, you know, different design centers, different goals.

Speaker 2:

Totally. Absolutely different design centers, different goals.

Speaker 1:

I mean, it it is a

Speaker 2:

packet filter. That is kinda how it started. So it's, like, definitely different design centers, different goals, for sure. And usability: not one of those goals.

Speaker 1:

Debuggability, not one of those

Speaker 2:

And so bpftrace was built as a much more DTrace-like front end on BCC. I do give them credit, because it is frustrating to me when people kind of get very inspired by DTrace but don't mention DTrace. It's, like — DTrace is very inspired by awk. I would like to think that we have mentioned awk extensively, that we are not pretending that we are not borrowing good abstractions from awk. And to bpftrace's credit — in their defense — it's pretty clear that, like, hey, this is DTrace-inspired.

Speaker 2:

Hey, great. Again, I just want this thing to work — like, really, that sounds terrific. But unfortunately, it's not really great. And in particular — you know what it is? It's very revealing.

Speaker 2:

We said this last time too: the origins are different. When we designed DTrace, it was really designed to help you comprehend the system, and we used it a lot to debug the system. And it shows, because, on the one hand, the stuff that we built is the stuff you need. On the other hand, the constraints that we had are the constraints that we ourselves felt as we were solving this problem.

Speaker 2:

So the absolute safety constraint, for sure. Mhmm. You cannot harm the system by instrumenting it. We talked about that last time. But one that we did not talk about, because, I guess, I didn't think it needed to be belabored — but, apparently, it does:

Speaker 2:

Any dynamic instrumentation facility that can go into an arbitrarily fragile context is going to have modes in which it can drop data, where you wanna go, like, I wanna instrument the system in this way. It's, like, sorry, we actually don't have a buffer to put it in, and, we can't allocate memory in this context. So, like, we're gonna have to drop this data. I mean, that's just like I feel like that's an intractable part of the problem. Yeah.

Speaker 1:

I mean, I think — or you need to somehow pause the system. Right? It's one or the other. If you don't have a place to put it, you gotta choose.

Speaker 2:

If you don't have a place to put it, you gotta choose. And one of the things about DTrace that we did extremely early — because this is what I wanted, what we wanted, out of our debugging facility — is: obviously, you are gonna have these situations. When you have them, they always need to be very apparent to the person using the software. So you cannot drop data silently. And you have to indicate: I have dropped data, and this is why. I have dropped data, and it is because I am out of dynamic variable space, or I'm out of buffer space, or I'm out of aggregation space — I have run out of one of these things that needs to be statically sized a priori.

Speaker 2:

I have run into one of those limits, and here's the limit I've run into, and here's the way, by the way, that you can tune the system such that I am less likely to have this happen. That, to me, is, like, super important. I mean, that's, like, of course, the first thing we did. And I—
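A minimal sketch of the principle being described here — a statically sized buffer that counts and reports its drops rather than losing data silently. This is illustrative Rust, not DTrace's actual implementation.

    /// A fixed-size record buffer that refuses to drop silently: every drop
    /// is counted and reported, per the principle described above (not
    /// DTrace's actual mechanism).
    struct TraceBuf<T> {
        records: Vec<T>,
        capacity: usize,
        drops: u64,
    }

    impl<T> TraceBuf<T> {
        fn new(capacity: usize) -> Self {
            TraceBuf { records: Vec::with_capacity(capacity), capacity, drops: 0 }
        }

        fn record(&mut self, r: T) {
            if self.records.len() < self.capacity {
                self.records.push(r);
            } else {
                // Out of statically sized buffer space: drop, but never silently.
                self.drops += 1;
            }
        }

        fn report(&self) {
            if self.drops > 0 {
                eprintln!(
                    "dropped {} records (out of buffer space); consider tuning a larger buffer",
                    self.drops
                );
            }
        }
    }

    fn main() {
        let mut buf = TraceBuf::new(2);
        for latency_us in [450u64, 500, 100_000] {
            buf.record(latency_us);
        }
        buf.report(); // reports the one record that didn't fit
    }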

Speaker 1:

think incredibly important. I think one mistake maybe we made is we then said: these are all the things that happened, but, I don't know, here's what I think is the data. So, I don't know.

Speaker 1:

Make of it what you will, even though it's probably all wrong.

Speaker 2:

Well, and that's it. It's like — once you've had certain kinds of drops, you've effectively lost data integrity. Even though it's not Turing-complete, when you're writing a D script, there is, like, data integrity that's involved. And in particular, if you want to trace every IO — you wanna know the latency of every IO — you're gonna have to record some datum in an auxiliary space when that IO starts. And when that IO is done, you're gonna have to compare against the timestamp when this thing started.

Speaker 2:

That's assuming that the system doesn't keep this timestamp on its own, which it doesn't in Linux, and it doesn't on most other systems as well — we don't keep the timestamp because we don't necessarily need it all the time. So you want dynamic variables for that. And the idea is really basic, but then it becomes really, really important that if I cannot execute one end of that, I need to be aware of it, because I've lost coherence. Because it means I'm gonna see the end of an IO that I never saw the start of, or I'm gonna see the start of an IO that I never see the end of.
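A sketch of the start/done pairing being described: record a timestamp in auxiliary space keyed by IO, and treat any unmatched start or done as lost coherence. Illustrative Rust with a hypothetical event format, not any particular tracer's output.

    use std::collections::HashMap;

    // Hypothetical trace events: each IO gets a start and a done record
    // sharing an ID, with a nanosecond timestamp.
    enum Event {
        Start { id: u64, ns: u64 },
        Done { id: u64, ns: u64 },
    }

    fn main() {
        let events = vec![
            Event::Start { id: 1, ns: 100 },
            Event::Start { id: 2, ns: 150 },
            Event::Done { id: 1, ns: 4_100 },
            Event::Done { id: 3, ns: 9_000 }, // its start was dropped: coherence lost
        ];

        // The auxiliary space: start timestamps keyed by IO id.
        let mut starts: HashMap<u64, u64> = HashMap::new();
        let mut unmatched_done = 0u64;

        for ev in events {
            match ev {
                Event::Start { id, ns } => {
                    starts.insert(id, ns);
                }
                Event::Done { id, ns } => match starts.remove(&id) {
                    Some(start_ns) => println!("io {id}: {} ns", ns - start_ns),
                    // A done with no start means a record was dropped somewhere.
                    None => unmatched_done += 1,
                },
            }
        }
        // Leftover starts are IOs whose done we never saw: also suspect.
        eprintln!(
            "{} done-without-start, {} start-without-done",
            unmatched_done,
            starts.len()
        );
    }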

Speaker 1:

Yeah. And and then your your sample will have some fraction of the data, but you have no idea of the bias associated with that.

Speaker 2:

That's right. And this is one of these things that's like you know, this is not like a DTrace interface issue. This is this is kinda deeper than that, and I've spent a lot of time just beating my head against

Speaker 1:

So in bpftrace, do you get data drops without notification?

Speaker 2:

Yes. It'll just do this. Yeah.

Speaker 1:

Okay. Sorry. I'm sorry that happened to you.

Speaker 2:

Are you? You don't sound like that. You don't I mean, as opposed

Speaker 1:

to as opposed to me, just to be clear.

Speaker 2:

That's right. That's right. And the other thing that is, like, super challenging about this particular issue is I'm instrumenting Linux; I'm instrumenting the guest. And when you have one of these things — I mean, honestly, my first thought, especially kinda coming out of DTrace, is: I am not instrumenting the system correctly.

Speaker 2:

Right? Like, where there there's a code path where an IO can complete that I'm not instrumenting, which is a kind of a reasonable first assumption.

Speaker 1:

Totally.

Speaker 2:

And you spend a lot of time, like — and it's kind of impossible to know. In fact, a part of me, like, still doesn't totally know. The only reason I was able to convince myself that that's not what's happening is because, when I was able to write much simpler things, I was able to prove to myself that, no, every single IO is going through these two code paths: this code path to start it, this code path to end it.

Speaker 2:

And we, in fact, are silently dropping a bunch of these on the floor, which is just whatever. It's

Speaker 1:

I have a very country-mouse question. In — yeah — obviously, in illumos, Helios, Solaris, there are the io start and io done probes. Like, we've taken the time to annotate where in the code IOs occur.

Speaker 2:

Why why are we doing this? Yes. Go ahead. Do it.

Speaker 1:

Is that also true? I mean, is it true in bpf—

Speaker 2:

okay. You know,

Speaker 1:

just asking for, I guess, myself in the future.

Speaker 2:

It could be worse. It is. God, I'm already, like, I'm already bargaining here. It's Stockholm syndrome for sure.

Speaker 2:

No. I mean, obviously, no. They don't

Speaker 1:

It probably is Stockholm syndrome, because

Speaker 2:

I just want this thing to work. Like, I actually Yeah.

Speaker 1:

Yeah.

Speaker 2:

No. They did not. And even worse than that, like, in some cases, the routine is named one way, and in different versions, it will be named another. I mean, you're basically using their equivalent of fbt, which also, by the way, doesn't allow you to instrument a function return — you can only instrument function entries.

Speaker 1:

Wait. Function boundary tracing as long as the boundary is the entry?

Speaker 2:

As long as the boundary is the entry. Well, they don't call it that. I mean, in their defense, they don't call it fbt.

Speaker 1:

Okay. Okay. Fair

Speaker 2:

enough. It's their kfunc. But so you've gotta, like, instrument a bunch of things that don't necessarily exist. And then I found that — the other — sorry. Where were we?

Speaker 2:

Yeah — I guess, well, sorry, we'll get to Nagle eventually. But the other thing that is super frustrating is: what is the coherence of the instrumentation with respect to the start of the instrumentation? So in particular, if you instrument the same point of instrumentation multiple times in DTrace — which is to say, you effectively have multiple clauses that are instrumenting that same point — we guarantee that we don't, like, just start executing the third clause.

Speaker 2:

You know what I mean? And it's very clear that there's much more of, like, a rolling start to BCC. I mean, they're like, well—

Speaker 1:

You mean, like, sort of, if you have three probes attached, or three statements or whatever attached to this — yeah — probe, the third might execute absent the first and second or whatever. Yeah. And, you know, in DTrace, we have

Speaker 2:

this this important belief that we're gonna get everything set up and everything is ready to go, and then there's one master switch that we're just like, now go.

Speaker 1:

Right.

Speaker 2:

I think you can say with confidence that that's not the design approach in BCC. They're just like, oh, throw some instrumentation at it. Throw some more instrumentation at it. Throw some more instrumentation at it. And as a result—

Speaker 2:

Like, your probe — or your entire, like, bpftrace program — kind of flickers on, where, you know, one enabling will start firing, then another enabling will start firing, then another enabling will start firing. That clearly happens over a big span of, you know, hundreds of microseconds, milliseconds, or tens of milliseconds. So, I don't know. Good luck, everybody.

Speaker 2:

Good luck. I

Speaker 1:

mean, that does sound brutal.

Speaker 2:

It's hard. It's just one of these things where — and I think this is kind of where I was as I was digging into the bpftrace issues — again, I don't even wanna be right about this right now. I'm actually happy. I just wanna use this thing.

Speaker 2:

I'm really just trying to debug some other problem. Like, I know that is not the problem right now. I know that.

Speaker 1:

So, two questions on this hero's journey. First, why didn't you measure it from the perspective of the other side? You know, from Propolis, from our virtual—

Speaker 2:

yeah. No.

Speaker 1:

I you know, I No. No.

Speaker 2:

Sorry — from Propolis?

Speaker 1:

Like, from the VMM — you know, the other side of the virtual machine boundary.

Speaker 2:

So — well, the reason I really wanted it: a customer was seeing this, and I also just felt like this is something that we would expect to happen over time, where what you actually have is kind of the customer's viewpoint — the VM's viewpoint — of networking performance, IO performance, what have you. And I just felt like this is kind of an important perspective to get, because that's ultimately the perspective that matters to the VM: what am I seeing? And I think it's kind of an important ground truth to go establish. So—

Speaker 1:

Yeah. And a great muscle for us to build as Oxide: to understand, you know, how those virtual machines are operating.

Speaker 2:

Yeah. Exactly. So I felt like it was that, and I stand by it — I think it was good. So I put together this bpftrace script that — and, okay,

Speaker 2:

sorry to do this: the BEGIN and END probes also don't work the way you would expect. The BEGIN and END probes — they try to get super clever, and they actually install a user-level probe on themselves, on the bpftrace process.

Speaker 1:

Uh-huh. But

Speaker 2:

then bpftrace has been stripped by a bunch of distros, so you can't actually do that. So if you run bpftrace -e with a BEGIN probe or whatever, you get an error message that just makes absolutely no sense — it's something like, you know, I'm missing a BEGIN trigger. It's like: sounds like a you problem. What? Anyway, whatever.

Speaker 2:

That's exciting. I I I digress. I digress.

Speaker 1:

Yes.

Speaker 2:

So — you got some data? Got some data, and accepted that this data was gonna be somewhat lossy, which is — okay — frustrating to accept, but whatever. And then did some post-processing. Actually, it was great to do the post-processing of that in Rust, which — I actually like, Rust is increasingly — for something like this where I'm, like, post-processing data,

Speaker 1:

you know, I would say

Speaker 2:

that, like, shell and awk have been the thing I've reached for over my career, like, even in the last couple years. And now, increasingly, I'm, like, reaching for Rust for these kinds of things that are—

Speaker 4:

Yeah.

Speaker 2:

You know, much quicker, almost throwaway work, but it's just much faster to do it in Rust. Well, I—

Speaker 1:

I feel like — agreed. And it's much faster because I don't have to do it twice. Like, I think my usual workflow is to, like, fuck it up once with Python or awk or Bash or whatever, and then, like, spend as much time again figuring out why. Whereas I feel like I can get it right the first time more often with Rust.

Speaker 2:

Yeah. Get it right the first time. And then, when you wanna go build something that's a little more sophisticated — like, in particular, I do have this issue of, like, hey, after an IO has been outstanding for some amount of time, 500 milliseconds, we're gonna treat that as, like, whoops — thanks, bpftrace.

Speaker 2:

That's a drop, effectively, and I just wanna, like, discard that data point. And that's kind of a pain. I mean, yeah, obviously, you could do that in awk and shell, but it just gets grottier

Speaker 1:

and grottier.
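A minimal sketch of the post-processing step being described: discard any latency past a cutoff, on the theory that it's an instrumentation drop rather than a real IO. The 500-millisecond figure is the one from the conversation; everything else here is hypothetical Rust, not the actual tooling.

    /// Post-process measured latencies (in microseconds), discarding anything
    /// past a cutoff on the theory that it's a tracing drop, not a real IO.
    fn scrub(latencies_us: &[u64]) -> (Vec<u64>, usize) {
        const CUTOFF_US: u64 = 500_000; // the 500ms figure from the discussion
        let kept: Vec<u64> = latencies_us.iter().copied().filter(|&l| l < CUTOFF_US).collect();
        let discarded = latencies_us.len() - kept.len();
        (kept, discarded)
    }

    fn main() {
        let measured = [4_200u64, 100_000, 3_900, 750_000, 4_500];
        let (kept, discarded) = scrub(&measured);
        println!("kept {} samples, discarded {} presumed drops", kept.len(), discarded);
    }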

Speaker 2:

Totally. But so, anyway, we get this data and visualize the data. And in particular, there is some real clear banding at 100 milliseconds. And—

Speaker 3:

I'm like, what is the okay.

Speaker 2:

So we're seeing — and I can kinda break this up by different kinds of operations — we're seeing this in particular on flush operations. So, we've got our EBS-like service, our virtual storage service, and it is presenting a virtual volume, a virtual disk, to the guest. And that guest can do a read operation or a write operation. But we tell the guest that, hey, by the way, there is effectively nonvolatile — or, there's volatile caching here.

Speaker 2:

So you cannot assume, when you do a write to me, that it's synchronous. It's not necessarily nonvolatile. You need to come back later and do a flush if you want—

Speaker 1:

Right.

Speaker 2:

If you wanna make sure — so, this is the way it is. And in particular, these flush operations would have this banding at 100 milliseconds. But it wasn't all of them. It wasn't even, like, necessarily many of them. We finally got a workload that had, like, a third of them at these 100-millisecond outliers. The problem is, like, when everything else is clustered at, like, 4 milliseconds, you know, 100 milliseconds is a long time.

Speaker 2:

That's a long time
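A file-level analogue of the write/flush contract being described, in Rust: the write may land in a volatile cache, and it's the explicit flush that asks for durability — and where, in this story, the latency outliers showed up. This illustrates the contract; it is not the guest's actual block-device interface.

    use std::fs::File;
    use std::io::Write;

    fn main() -> std::io::Result<()> {
        let mut f = File::create("data.bin")?;
        // The write may land in a volatile cache; nothing is promised yet.
        f.write_all(b"important bytes")?;
        // The flush is what asks for durability — and it's where the
        // latency (here, the 100ms outliers) shows up.
        f.sync_all()?;
        Ok(())
    }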

Speaker 1:

That's right. That that outlier has an outsized effect for sure.

Speaker 4:

An outlier

Speaker 2:

Outlier has an outsized effect. And, you know, I'm a little bit embarrassed about how long it took — well, I mean, it was a very long day at that point. But, you know, I wanted to debug this from first principles. So, really, kind of starting at — you know, fortunately, at this point we've got this same behavior now reproduced kind of inside an Oxide rack.

Speaker 2:

So now I've got the ability to actually instrument the system from kind of an arbitrary point, right, not just from the guest. So we can begin to see, like, okay, where do we see these 100-millisecond outliers? Because my assumption was that there was something happening in Crucible, in the storage layer, that was inducing these outliers. I don't know.

Speaker 2:

Feels reasonable. Right? I don't know. You don't know. Knowing how this ends, you don't have to tell me that.

Speaker 2:

But the other thing I would say is that we knew we didn't see it on the bench. If the network is out of the picture, we knew we didn't see it — you'd run it, and the numbers look good. And then—

Speaker 1:

Like when we're doing a loopback test. Mhmm. That's

Speaker 2:

right. With the volume, you're writing everything in triplicate. So when all three of those targets are on the same, you know — the same compute sled — we knew we weren't seeing it.

Speaker 2:

So Yeah.

Speaker 1:

It was

Speaker 2:

only when going over the network that we're seeing these outliers. And, you know, just trying to, like, chew through every layer — and this is actually where USDT was huge, Adam. Because Alan Hanson has been leading up the storage efforts, and he's done a great job of adding USDT probes to Crucible.

Speaker 2:

So Crucible is loaded with USDT probes. And in particular, there are identifiers that are relevant to Crucible that are in each of these operations. So you have these job identifiers, which were actually really important to begin to correlate activity that you're seeing on one side versus activity you're seeing on the other side.

Speaker 1:

Got it. So you can look on both both client and server effectively with that same ID?

Speaker 2:

Both client and server. But it's actually really hard to, like, actually get all this to line up. Right? Because you are kinda relying on — I mean, it's a bit of a time problem, right? You're kind of relying on wall timestamps, and then the divergence of system clocks — you know, you're trying to kinda roughly true this up to figure out where it's coming from.

Speaker 2:

And what I was seeing is that we would see these outliers — we were seeing them frequently enough that I could really begin to understand them, and they were, like, cliffy at 100 milliseconds. You know, once we've seen this, I can stop gathering data, and I can look at what I've got. And the next step was, like, alright. My kind of assumption at that point was, like, there must be some reason why this isn't going out on the wire on the remote end — which, it turns out, was very close, but also very far away, because I ended up convincing myself too early that it was going out on the wire. And I went back to kind of the client end of this and spent a bunch of time and a bunch of DTrace invocations looking at — and this is where, like, the TCP and IP probes were very helpful, in correlating that with probes at the syscall layer and the USDT probes. So you can begin to, like, get this whole picture of what's going on.
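A sketch of the job-ID correlation just described: join client-side and server-side records on a shared identifier, trusting round-trip times measured on one clock and treating cross-host wall-clock deltas as only approximate. The record shapes here are hypothetical Rust, not Crucible's actual probes.

    use std::collections::HashMap;

    // Hypothetical records from the two sides, stamped with wall-clock time.
    struct ClientRec { job_id: u64, sent_ns: u64, acked_ns: u64 }
    struct ServerRec { job_id: u64, done_ns: u64 }

    fn main() {
        let client = vec![ClientRec { job_id: 7, sent_ns: 1_000_000, acked_ns: 101_000_000 }];
        let server = vec![ServerRec { job_id: 7, done_ns: 4_000_000 }];

        // Index the server side by job ID so the two logs can be joined.
        let by_id: HashMap<u64, &ServerRec> =
            server.iter().map(|r| (r.job_id, r)).collect();

        for c in &client {
            if let Some(s) = by_id.get(&c.job_id) {
                // The client-side round trip is trustworthy: one clock.
                let rtt_ms = (c.acked_ns - c.sent_ns) / 1_000_000;
                // Cross-host deltas are only rough: the two clocks diverge.
                println!(
                    "job {}: client saw {} ms; server finished at t={} ns (skew unknown)",
                    c.job_id, rtt_ms, s.done_ns
                );
            }
        }
    }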

Speaker 2:

And what you actually see is that we're just not getting an acknowledgment of this. We sent a flush request, and then we don't see an acknowledgment for 100 milliseconds. Ultimately, that's what it boils down to. And we now know that the flush definitely occurred — you can see the flush happening at the other end.

Speaker 2:

It's happening super promptly. It's happening in, whatever it was, you know, 3 to 6 milliseconds. And you can see it being sent, but it's not actually showing up. At that point, you're like, oh, wait a minute. Is this being Nagled?

Speaker 2:

And so, Adam, if you had to

Speaker 1:

deal with, I mean, this is

Speaker 2:

I think this is a little unfair to John Nagle. Have you— No.

Speaker 1:

You know what? After spending some time looking, I don't think I've been victim to Nagle's algorithm in the past, as far as I know. Yeah. Which I know is a privileged statement, I realize.

Speaker 2:

It is. And I'm glad you raised that. Yeah. So the idea is that, hey, I have got a small packet, and I am already waiting to hear an ACK back from the other side.

Speaker 2:

Like, I've not I I'm kinda pending. I'm waiting to hear from the other side.

Speaker 3:

Yeah.

Speaker 2:

That might be an indicator that I am flooding the network with these really small packets. So instead of doing that, let's just hang out a little bit here. Let's leave this small packet here until we hear that ACK back from the other side. And, by the way — I'm sorry, I've not been looking at requests to speak.

Speaker 2:

Folks have been asking — if folks have got expertise on this, and I'm wrong, by the way, feel free to jump up here. And especially if folks have actually debugged this, I definitely wanna hear from you. So this is the idea of Nagle's algorithm: we wanna be kind to the network, which is actually a very — you know, and I think I saw Tom Lyon here earlier. You know, I feel like the great Internet collapse of 1986 — I mean, have we talked about that here?

Speaker 1:

I don't think so. No. I don't think so.

Speaker 2:

And is Tom here? Tom — I'm not sure if Tom can talk about the great Internet collapse, but basically, the Internet as designed was going to collapse because of congestion. The Internet was dying of congestion. Yeah. So, Tom, can you take us back to the glory days of 1986 and the great Internet collapse? I mean, do I have the history even approximately correct?

Speaker 3:

I'm not entirely sure. I remember Sun was not actually connected to the Internet for quite a while, so we were watching from the sidelines as a lot of the stuff was happening. Oh, interesting. Nagle's algorithm was definitely important, but nowhere near as important as Van Jacobson's work, which I think is more like '88.

Speaker 2:

And right. I mean, yeah, just to be clear, Nagle's algorithm is somewhat unrelated to — or is unrelated to — the great Internet collapse. The

Speaker 3:

the, you know, the exponential back off

Speaker 1:

Mhmm.

Speaker 3:

If you lose the packet, that kind of thing. Yeah. Yeah. That's what really saved the Internet.

Speaker 2:

That saved the Internet. Right. But there was this kind of idea — again, correct me if I'm wrong, Tom — that we actually needed to consider the network as part of the design criteria of the broader system. That you cannot allow individual actors to act greedily, because they will destroy the network. It's kind of the—

Speaker 3:

Plus the endpoints were very different. Right? There are lots of terminal aggregators — you know, milking machines. And so you get one character at a time from a milking machine. You've never heard that?

Speaker 1:

What's up?

Speaker 2:

No. What's a milking machine? I mean — we're not talking about the yogurt.

Speaker 3:

A whole bunch of serial lines coming into an Ethernet port.

Speaker 1:

Wow.

Speaker 3:

I don't know where it got that name, but

Speaker 1:

That's great.

Speaker 3:

But, you know, you get a character in on a terminal line — what are you gonna do with it? Okay, you wrap it up in a packet and send it. Well, if everyone's doing that, then you really need Nagle's algorithm, because you just get a shit-ton of small packets, and you have no idea when more characters are coming in from the serial lines.

Speaker 2:

Right. And that's because you're just constantly sending. And for a user, in kind of the worst case, the latency bubble that you're gonna see is the time until you get that ACK from the other side, which doesn't feel like—

Speaker 1:

it's like you know, it's like doesn't feel

Speaker 2:

like it's that long. Yeah. So what? Like, wait until you've heard an ACK. Like, just settle down.

Speaker 2:

And the fact that you haven't heard an ACK back indicates that you may be suffering from congestion, which feels reasonable.
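For reference, a paraphrase of Nagle's rule from RFC 896, as a sketch in Rust rather than a real TCP stack: full segments always go out; a small segment goes out only if nothing is unacknowledged, otherwise it waits for the ACK.

    /// A paraphrase of Nagle's rule (RFC 896), not a real TCP stack:
    /// full segments always go; a small segment goes only when nothing
    /// is outstanding — otherwise it waits for the ACK.
    fn nagle_may_send(bytes_queued: usize, mss: usize, unacked_in_flight: bool) -> bool {
        if bytes_queued >= mss {
            true // a full segment is always worth sending
        } else {
            // Small packet: only send if we're not already waiting on an ACK.
            !unacked_in_flight
        }
    }

    fn main() {
        assert!(nagle_may_send(1460, 1460, true)); // full segment: send
        assert!(nagle_may_send(64, 1460, false)); // small, idle line: send
        assert!(!nagle_may_send(64, 1460, true)); // small, ACK pending: hold
        println!("ok");
    }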

Speaker 1:

We're talking about human scale. I mean, talking about 100 milliseconds, or 40 milliseconds with Nagle's algorithm, I think. It's not, you know, not that noticeable for us people.

Speaker 3:

Yeah. Plus plus it was in the days when, you know, you're you're doing stuff over the wide area. It's it's a miracle things work at all. You're you're not gonna complain too much about the response.

Speaker 1:

Right. The fact that you get an ACK at all is a miracle. Right?

Speaker 2:

Right. Right.

Speaker 1:

Yeah. And so I think

Speaker 2:

and, you know, all of this, like, might not be something we're talking about today had it not been for work that was happening in parallel on delayed ACKs. So, kind of at this same moment — and this is all coming from, by the way, a Hacker News post from John Nagle. And, man — I mean, I don't think we're the only ones to talk about getting Nagled. Right? Do other people talk about getting Nagled?

Speaker 2:

And, like, I mean, if I'm John Nagle, I'm like, hey, dude, it kinda sucks that this is what my name means to you. I mean, that's like, what?

Speaker 3:

Yeah.

Speaker 2:

Although, Adam, thank you — I really admire your discretion in not revealing that getting Cantrilled is actually something that you, and everyone around me, refer to. I mean, I'm sure that

Speaker 1:

it's like, god, I got Cantrilled again today. Oh, god.

Speaker 2:

Well, you

Speaker 1:

talk about getting Leventhaled, and you and I—

Speaker 2:

I do know getting Leventhaled.

Speaker 1:

Very different meanings of what that means, by the way.

Speaker 2:

That's true. That's true. So maybe you and John Nagle have got something to talk about.

Speaker 1:

Actually, there we go. Next time I see him.

Speaker 3:

You know, the delayed ACK thing, it manifests itself more in the modern age, because people expect TCP to do lots of things that it wasn't really expected to do in the past. So, like, request-response protocols were not really part of the design point.

Speaker 2:

Yeah. Yeah. This is a really important point that you're making, Tom. A request-response protocol being, like: this is effectively a remote procedure call that I'm making.

Speaker 2:

Like, I am calling you, and, by the way, until you reply to me, I have got nothing to do. And, as you say, that is not really the design point. But there's a bunch of stuff on the network that actually does rely on that kind of behavior. Right?

Speaker 3:

But these days, you know, you're much better off finding a library, so you don't have to do your own socket programming, because there are way too many flags to mess up.

Speaker 2:

There are way too many flags to mess up. And in particular, delayed ACKs became the default behavior. A delayed ACK is saying, well, you know, I just received this, and instead of acknowledging it, let me just see if I receive something else, and let me save the network a little bit. Let me see if I can actually bundle some ACKs together. And, you know, kind of—

Speaker 1:

And it's also, Bryan — I'm going to reply with some data. And so I don't need to send an ACK, because if I'm just gonna reply anyway, I can just bundle it along for the ride.

Speaker 2:

Yeah. And I kind of feel this is like a teenager's approach to, like, taking out the trash or whatever. You know what I mean? It feels like—

Speaker 1:

Right.

Speaker 2:

Kind of Do

Speaker 1:

I have to do it now? But what if we just wait? Won't it, like, solve itself?

Speaker 2:

Maybe it'll solve itself. Yeah. And maybe, like, later I'll be kinda, like, walking to a friend's house, and I can, like, bring the trash out on the way and, like, save myself a trip.

Speaker 3:

Makes sense.

Speaker 2:

Right. So it makes total sense.

Speaker 3:

And remember, the number of packets was really important, because until the late nineties, all routing was done entirely in software. So you had all this per-packet overhead.

Speaker 2:

Yeah. Interesting. So, really, every packet is sacred, Tom, in this world. And it is, like — I do think it's really important context, like, the great Internet collapse of 1986.

Speaker 3:

Except the ones we dropped for congestion.

Speaker 4:

Right.

Speaker 2:

But, I mean, the microprocessors are, you know, 12 megahertz, 16 megahertz, 30 megahertz, maybe 40 megahertz during this time, but unlikely. Right? I mean, everything is just so scaled down, and, you know, RAM sizes — you're talking, you know, a workstation in 1986, Tom, has what? Probably 4 megabytes of RAM, maybe?

Speaker 2:

Not even that?

Speaker 1:

I mean, it I

Speaker 3:

mean, the Sun-3 sweet spot was, I think, 4 megabytes. So that's about that time.

Speaker 2:

Right. So you got a 16 megahertz ish microprocessor, 4 megabytes ish. And then what is the actual network speed? That has gotta be, one megabit, or where are we here on networking speeds? I don't have a good intuition for that.

Speaker 3:

Yeah. If you were really rich, you could afford a T1 line at one and a half megabits.

Speaker 2:

And so on that Sun-3/60, what was the speed on that thing? Was it—

Speaker 3:

Well, well, it had Ethernet on board at 10 megabits.

Speaker 2:

At 10 megabit. But okay. But,

Speaker 3:

One of my favorite features, which I was responsible for, is you could directly drive synchronous serial protocols out of the serial ports and go up to 56 kilobits. Oh, wow. That's pretty great. We could run X.25 and SNA and all these wacko things that existed back then.

Speaker 2:

That's wild.

Speaker 1:

But so,

Speaker 2:

I mean, but, Tom, your point that, like, these are really important optimizations to avoid sending packets that you don't need to send. If I can avoid sending a packet and avoid dropping a packet, if I can just hold on to this for just a little bit of time and introduce a little latency bubble, perhaps this could be a big win. And I think either of these things in isolation would be a win. Together, they are deadly. Because on one end of a connection, you are saying, well, I've got an ACK outstanding, and I'm gonna wait till I hear back before I send the small packet.

Speaker 2:

And on the other end, you've got, like, well, you know what? I'm not gonna send this ACK yet. I'm gonna wait to see if I get anything else. And it's like, yeah, you're not gonna get anything else. They're waiting for you, and you're waiting for them.

Speaker 2:

And in our case, that 100-millisecond band was the delayed ACK retransmit time. That's when it's like, oh, wait a minute, I haven't heard back on this thing, so I guess I'm gonna retransmit the ACK — and then that retransmission would kinda unstick you, and you would send this thing that had been delayed. And now you have 100 milliseconds. And the other problem is, like, that absolute time is just, like, outsized given the speed of everything now.

Speaker 2:

Speed of networks, speed of CPUs, amount of DRAM. It's just like That is remarkable. I

Speaker 1:

was reading — I was reading the Stevens book, you know, published in '96 or something, and it's got the same constants that we see in your data, Bryan. So, to your point: that those constants have endured 30 years and longer is kinda crazy.

Speaker 2:

It is definitely crazy.
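The fix this whole story circles around is setting TCP_NODELAY on the connection, which disables Nagle. In Rust's standard library, that's set_nodelay; a minimal sketch, with a hypothetical address and payload — not the actual Crucible code:

    use std::io::Write;
    use std::net::TcpStream;

    fn main() -> std::io::Result<()> {
        // Address is hypothetical; this shows the shape of the fix.
        let mut stream = TcpStream::connect("192.0.2.10:3810")?;
        // Disable Nagle (set TCP_NODELAY) so small request/response writes
        // go out immediately instead of deadlocking against a delayed ACK.
        stream.set_nodelay(true)?;
        stream.write_all(b"flush request")?;
        Ok(())
    }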

Speaker 3:

Yeah. The minimum TCP timeout is still 200 milliseconds, which makes no sense if you're in a data-center context.

Speaker 2:

Right? That makes no sense.

Speaker 3:

But the problem is, people think of these as TCP attributes when they should be

Speaker 1:

variables that come from the routing table. So you should be able to annotate your routing table and say,

Speaker 3:

hey, I know this network is local. Let's turn down all these timeout values.

Speaker 2:

Yeah. Interesting. And is there really, like, an RFC for that, Tom? Or was that an idea that had traction at one point? Are we—

Speaker 3:

is everyone just gonna give me a

Speaker 1:

minute to It was Awesome. It was your

Speaker 2:

yeah. I exactly. I don't know.

Speaker 1:

I'm ready to sign up. We're ready

Speaker 2:

to— Great. They're like, we wanna buy it too. But yeah. And then you get into, like, a bunch of different aspects of this. And this is where I'm really curious.

Speaker 2:

You know? Raggi, you're on the stage, and I saw Eliza out there — I know Eliza's debugged this before. I would love to hear from other people that have debugged this: how did you debug it? Because I know how I debugged it, and it was with a whole lot of DTrace.

Speaker 2:

I mean, it got to the point where I actually, like — I counted up the number of times I'd used DTrace on Friday, when I actually debugged all this. And it's like, I'd used DTrace, like, 275 times over the span of the—

Speaker 1:

it's like

Speaker 5:

I've got, like, the opposing story, if you want, which is, like, debugging actually having Nagle turned off when you kind of wanted it on, which was super cool.

Speaker 2:

Yeah. Yeah. Please.

Speaker 5:

Right? So it was, like, mid-2012. I'd just got to Google — we'd just sold Wildfire to Google — and I'm drinking from the fire hose, taking apart all the internal libraries, and trying to figure out what we're going to use. And we'd been purchased into ads; we're a social media marketing company.

Speaker 5:

We're a social media marketing company. And so I called out at one point and they're like, Hey, we've got this like ads technology summit. You might want to go and just like see what else is going on. So I go there and there's some folks talking about their first Event system because most of Google stuff for many years before that was all ETL and sort of big batch stuff, and they're kind of moving into the Event space. And so these folks were talking about how they sort of put the system together.

Speaker 5:

It was the usual sort of big-company over-engineered madness going on. But they were talking about numbers towards the end, and they're sort of trying to advertise the system as, like, N million events per core per day, blah blah blah. And I was kind of like, well, I was playing with this stuff, and I did that many events on my workstation with my prototype in the hour before I got here. What is going on? But I happened to have been walking through, like, the Stubby code and the various event IO libraries and so on and so forth.

Speaker 5:

They were in the prod code base, and I had noticed, standing out from prior experience, like: oh, they're turning Nagle off for everyone, unconditionally, all the time — which completely makes sense if, like, everything you do is Stubby and web, right? But these folks are now doing evented stuff. So I came back afterwards and I was like, I wonder. And I went and looked in their library, and they've got no buffering on their sockets, and they're just doing tiny writes all the time. So it turns out they're stuck in the routing table lock, right?

Speaker 5:

Because they're constantly spamming back into the network when they didn't actually mean to, because they made their own custom framing, and blah, blah, blah. So it still comes up, you know. And it's like, sure, it's kind of a mistake: you decided to go and write your own framing in your own C++ library, and you were just like, I didn't put a buffer in front of that, I'm just gonna do raw writes and save myself memory. But you didn't, in this case.

Speaker 5:

And so the simple fix there is, like, okay, turn Nagle on, and then, you know, that part of the problem goes away. But that fits the universe where all you're doing is a stream of events. It doesn't fit, as someone was saying earlier, this universe where you're doing Stubby, you're doing RPC, where you actually want the back-and-forth. But, yeah, that was like a standout for me, because there's been only a couple of events in my career where it went that way around. Normally, it's always the other way around.

Speaker 5:

It's like, oh, I just built another reactor library, or I'm starting to use a new IO library or whatever, and the IO library hasn't yet learned to turn it off. And so the first thing you end up going looking for is: oh, it seems to be slow for RPC, and that's the problem.

Speaker 2:

Yeah. That is really interesting. And just in terms of, like — so the right answer there, presumably, is not turning Nagle on, because Nagle is kind of papering over what sounds to me like a broader issue of, like, there should actually be buffering.

Speaker 1:

I I

Speaker 2:

would just — the buffering is the answer there.

Speaker 5:

Yeah. Yeah. I mean, you know, it depends what system you're on. Right? If you're significantly memory-constrained, maybe you wanna let the kernel deal with it, 'cause it's already got the buffers sitting around, or maybe you can push it all the way down to the NIC or something.

Speaker 5:

But yeah, any high-level system, in any normal world — it's totally: do it in user space, decide when you wanna send, you know, all of that kind of stuff.
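A sketch of the user-space buffering being advocated here, in Rust: turn Nagle off and coalesce the small writes yourself, so you decide when bytes go out. The peer address and framing are hypothetical.

    use std::io::{BufWriter, Write};
    use std::net::TcpStream;

    fn main() -> std::io::Result<()> {
        let stream = TcpStream::connect("192.0.2.10:9000")?; // hypothetical peer
        stream.set_nodelay(true)?; // Nagle off: we decide when bytes go out
        // Coalesce the tiny event writes in user space instead of relying
        // on the kernel (Nagle) to batch them for us.
        let mut out = BufWriter::with_capacity(64 * 1024, stream);
        for i in 0..10_000u32 {
            out.write_all(&i.to_be_bytes())?; // many small framed events
        }
        out.flush()?; // one explicit point where the batch goes out
        Ok(())
    }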

Speaker 2:

Yeah. Interesting. And then so did you it sounds like you just debugged that one by inspection. Is that right?

Speaker 5:

Well, it was pure luck. I had, like, literally I've been, like, drinking from the fire hose and I'd happened to have read all of those libraries right then and there, and then they were describing this thing. And I was like, I wonder I wonder. I'm gonna go have a look. And sure enough, it's just like, here's a behavior, and you can go and run it on your workstation and get a different behavior from when you go and run it in production.

Speaker 2:

Yeah. That's really interesting. And, also, I mean, I gotta say, like, I mean, really lucky. That'd be the it was really lucky.

Speaker 5:

Yeah. For sure.

Speaker 2:

Yeah. Because, you know, the other time that I've had this problem, a bunch of us were debugging it, and it was showing up as, like, Cassandra latency. And I didn't actually debug it — it was a colleague. I mean, not for lack of trying, but I was starting at the bottom of the stack, and I was finding all sorts of things that were basically dead ends. And, actually, Adam, even the statemap work that I did — I

Speaker 1:

don't know

Speaker 2:

if you ever saw that. I did that to debug this problem, this Cassandra problem. Which was, like, somewhat disappointing, because I just could not understand it. It's kind of like you have, like, the toddler that won't put on their shoes for 100 milliseconds, and then they put on their shoes. It's, like, very hard to debug when someone is refusing to allow work to progress for what is actually a pretty short time in human terms and a very long time in computer time. It is very hard to see, I found.

Speaker 2:

Yeah. And it was another colleague of mine, Eliza Japatnik, who, out of all of our shared frustration with this, was like, I'm actually just gonna go check this configuration file against the other configuration file, comparing Joyent to AWS. Because they were like, you know, we don't see this on AWS; we do see this on you. And, ultimately, she just took this configuration file, like, line by line, took it apart, and found this very important difference around TCP

Speaker 2:

no delay. But it was, Raggi, kind of more like your situation of, like, just getting lucky and kind of finding it; really hard to do from first principles, I found. You know, Adam, as we're kinda contemplating this: a great thing to go do on Twitter is search for TCP no delay. Did you do this?

Speaker 2:

No. Oh, it's so great. So there are the tweets, because, I mean, who randomly tweets about TCP no delay other than the people who've been just absolutely bit by this?

Speaker 1:

It's like a mixture of anger and euphoria.

Speaker 2:

It is. It's like

Speaker 5:

it's like 1 or 2 people in every group, from every programming language ever, that starts a new IP stack or a new network reactor library or whatever it is. It's like a couple of people in every one of those groups. It's amazing.

Speaker 2:

It is. And it's like, it doesn't have the frequency to be tweeted. I mean, it's like I don't think I mean, lord, help us if TCP no delay is ever trending on social media. Right? I mean, this is, like, not something that you are actually gonna see.

Speaker 2:

It doesn't happen that frequently. And Eliza had a great tweet, I'm gonna go find it, likening TCP no delay to the equivalent of forgetting to release the parking brake, which definitely feels that way. I'll drop that one in. But I love the fact that, actually, as I was looking at this, we've got a couple of these folks here who have run into this.

Speaker 2:

So, Dan, you're here, I think. And you wrote this tweet in 2019. I love the fact that I put out on Mastodon that I'd hit this, and you were the first one to, like, like it on Mastodon. I'm like, oh my gosh.

Speaker 2:

I'm just gonna reach out to Dan and ask him about this. So could you take us behind this tweet? Because this tweet feels painful. Just reading it.

Speaker 6:

Yeah. No. This is actually a long long story. It wasn't even anything in particular, but, so nowadays, I'm a researcher at Microsoft Research. Before that, I was a professor and I ran a research group focusing on distributed systems and low latency.

Speaker 6:

And the Nagle algorithm had become a meme in our group; it had just come up so many times that finally I just wrote on the wall, it's always the Nagle algorithm, because we had to check that every single time. I don't remember the first time I ran into this, but I do remember one very memorable one, which was when we were just starting a project to look at tail latency in distributed systems, back in 2012 or so. And we just started running some simple systems, you know, null RPC, memcached, a very basic Nginx configuration, and trying to look at the latencies of them. And we started just trying to understand them, right? We have some idea of how fast these systems should be, and queuing theory can tell us what the distribution should look like, and we can simulate it.

Speaker 6:

And then what we wind up with in practice is totally different. And one of the really weird things that the students I was working with, who've since gone on to graduate and do great things, noticed was that it really depended on where in the connection you were, because, you know, the first couple requests would get delayed, and nobody could figure out why. And we didn't even have any cool visualization tools at that point. I remember we were literally looking at pieces of paper with tables of latencies, and I looked at it and said, oh, did you turn off the Nagle algorithm? Because I had run into that before. And that's, by the way, a great tip, which is that if somebody ever presents you a table of latencies on paper, you can say, oh, did you try turning off the Nagle algorithm? And they'll really be impressed.
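
The shape being described is the classic write-write-read pattern: a second small write sits in the kernel until the peer's (itself delayed) ACK arrives. A toy Rust client against a hypothetical echo server makes the per-request timing visible; the address and payloads are made up:

```rust
use std::io::{Read, Write};
use std::net::TcpStream;
use std::time::Instant;

fn main() -> std::io::Result<()> {
    // Hypothetical echo server; the point is the timing shape, not the peer.
    let mut stream = TcpStream::connect("127.0.0.1:9000")?;
    let mut reply = [0u8; 64];
    for i in 0..10 {
        let start = Instant::now();
        stream.write_all(b"request-header")?; // first small segment goes out
        stream.write_all(b"request-body")?;   // Nagle holds this one until the ACK
        let n = stream.read(&mut reply)?;     // reply can't come until both arrive
        println!("req {i}: {n} bytes in {:?}", start.elapsed());
    }
    Ok(())
}
```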

Speaker 2:

Did you look thoughtfully at the numbers first before you

Speaker 6:

Absolutely.

Speaker 1:

That's right. Oh, no. Let me just let

Speaker 2:

me just do some calculations in my head. Okay. Okay. Now I've got a question for you. Got one question for you.

Speaker 2:

That's pretty great. Anyway, and indeed it was, in this case, it sounds like: turning off Nagle.

Speaker 6:

Well, that was one of many things. We found many others later. I wanted to mention something else, going back to the historical perspective, because, as somebody who half-heartedly follows congestion control research these days, it's kind of wild to me that people were thinking about congestion on the Internet, and what they were thinking about was not, you know, how do we fairly allocate resources, how do we make sure we get the right allocation, that we control the rates at which people are sending. It was just a one-off thing.

Speaker 6:

You know? Hey, some of these things are sending too many packets. Let's do an optimization to make them send fewer packets, by waiting or by delaying the ACKs. All of the other, like, serious congestion control stuff came later. And, you know, at the time Nagle was doing this, like, there was nothing to stop you from monopolizing the backbone link.

Speaker 2:

Right. This is all kind of, like, cooperative congestion control, effectively. There's nothing happening at the core to actually prevent congestion. Is that... am I

Speaker 6:

Yeah. I mean, there's nothing happening at the core. There's nothing really intentional even happening at the endpoints either. Right? It's just try to send less.

Speaker 2:

Right. Choose something different. Yeah. And, well, I think it shows that it's hard to reason about, certainly in those early days, that larger system. Everyone is only seeing kind of what's going on at the edge, and you gotta kinda react accordingly, or try to react to the messages from the network.

Speaker 2:

So from your perspective today, I mean, haven't we outlived having this as a default? I mean, surely this should not be the default. Surely one of these things should not be the default. Or is this something like medical residents working 36-hour shifts, something that we believe everyone must suffer through as part of their character building via software?

Speaker 5:

I feel like having a default here is probably sensible, but we should set the timer down to something really small. Like, if you do have some naive, dumb software that is doing, you know, individual byte writes or, you know, putc's to your network socket, then, sure, set it to a millisecond. It will still go slow, but it's not the, like, fatal thing. The only bad thing about doing that is that it's gonna make it harder to spot. So I don't know.

Speaker 3:

Problems when the default changes. Right.

Speaker 6:

Well, that's happened. Right? I saw one of those Hacker News posts, maybe the one that Nagle commented on, was about Go not having it on by default.

Speaker 2:

Yes.

Speaker 6:

And then Yeah. Somebody finding out that, in fact, doing one-byte writes is not going so well for them.

Speaker 2:

Yeah. And, but, Eliza, you authored the parking brake tweet. What's been your experience?

Speaker 1:

So I do kind of

Speaker 4:

have a perspective on this specific question of, like, you know, what should be on by default, which is that, you know, I was listening to, like, Raggi's answer, and I believe you said something sort of offhand about, like, oh, people are, like, authoring one of these, like, new IO reactor libraries. And I have the dubious honor of having been a maintainer of Tokio (not the primary maintainer, but a contributor), and we get a lot of GitHub issues that basically look like, you know, why is my Tokio app so slow? Well, you know, a lot of the time the answer is, like, did you take the parking brake off? Which is where that tweet came from. The tweet was sort of a reaction to this, like, you know, we just gotta have this, like, checklist for when somebody's like, well, I'm having these performance questions.

Speaker 4:

Obviously, the first one is sort of, like, well, did you compile your Rust program in debug or release mode? Okay. Yeah. You flipped the little switch on the checklist. Okay.

Speaker 4:

Light turns on. Great. Did you remember to turn off Nagle's algorithm? But

Speaker 2:

Well, so, can I just ask what else is on the checklist, so I can go check that right now? I can go quietly do it.

Speaker 1:

You know, those are those are 2 of

Speaker 4:

the biggest ones that I can think of. Unbuffered IO in general is also sort of a big one that actually comes up a lot. And this, I think, plays into this sort of, like, question of defaults, in that, like, in Rust, the standard library IO primitives are all unbuffered unless you explicitly say I want them to be buffered. So Tokio also, like, defaults to not disabling Nagle's algorithm when you make a TCP socket.

Speaker 4:

And the reason that Tokio does that is because it's what the Rust standard library does. And whether that's a good decision or not, I think we can, like, kind of debate. But there was, like, this very important sort of, like, the Tokio socket should behave the same as the Rust standard library socket, except that it's going to set, like, O_NONBLOCK, because it needs to do that, because you're going to use it with the, like, IO reactor, so it needs to be non-blocking. But besides that, we really don't want it to set any different socket options from the socket options the Rust standard library sets, and we want the user to, you know, essentially be able to interchange them, except that with the Tokio one you have to put the await keyword after every time you wanna do a read or write on that socket.
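
A minimal sketch of what that parity looks like in practice, assuming a hypothetical endpoint; Tokio keeps the std default, so Nagle stays on until you opt out:

```rust
use tokio::io::AsyncWriteExt;
use tokio::net::TcpStream;

#[tokio::main]
async fn main() -> std::io::Result<()> {
    // Same defaults as std::net::TcpStream, plus O_NONBLOCK for the reactor.
    let mut stream = TcpStream::connect("127.0.0.1:9000").await?; // hypothetical
    stream.set_nodelay(true)?; // the opt-out, same knob as std
    stream.write_all(b"ping").await?; // note the await, per the discussion
    Ok(())
}
```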

Speaker 1:

I think, I

Speaker 2:

mean, broadly, that's a very good decision. It would certainly violate the principle of least surprise if, like, Tokio was like, no, look, we've got a bunch of issues, so we set all of these things on your behalf to make it behave differently.

Speaker 4:

Downside is that now you have to respond to each of these GitHub issues and be like, you know, well, did you take the parking brake off? And I think that, like, you know, the other thing that you could do is the Go thing of, like, well, we're just gonna change the default from the default that everyone has had since John Nagle actually wrote the algorithm. And maybe you get the Hacker News comment from John Nagle.

Speaker 1:

Right.

Speaker 5:

Yeah. I mean, every

Speaker 4:

time line you wanna park your car.

Speaker 5:

Right. Every timeline. I did versions of these, like, several reactor libraries in Ruby over the years, and, like, we disabled these. And I think Twisted disables it by default. Like, I think all the dynamic-language libraries end up turning it off by default. I imagine it's turned off in Go by default because they were doing prototyping inside of Google prod, and they just matched what the prod behavior was; that was probably, like, really genuinely the primary reason.

Speaker 5:

But, yeah, I mean, in terms of being surprising behavior or not, honestly, I would come to a reactor library and be unsurprised in either direction. Like, you know, it's just one of those, like, check it off the list. It's, like, a thing that you start looking for when stuff doesn't behave the way you expect.

Speaker 4:

The kind of funny story is that the specific tweet that Bryan mentioned, the parking brake tweet, came out of a, to me, very funny GitHub issue, which was actually on a Tokio demo app. The demo app that we have is called mini-redis, and it's like Redis except it isn't very good, and, you know, it's, like, intentionally not very good, because it's supposed to walk you through, like, here's how you use the async runtime, here's how you use the synchronization primitives. It's full of, like, a bunch of comments that tell you how it works, and it's also, like, used in some of our tutorials. And so, very specifically, it implements all of Redis except for, like, Redis pipelining, so it has no connection reuse, right? Or it has very limited connection reuse. And the pipelining is kind of the thing, you know... you've maybe seen the image that's like, here's how you draw an owl, and it's like, draw 3 circles, and then step 2 is, like, draw the rest of the owl.

Speaker 4:

The pipelining is kind of the like rest of the owl of Redis. And it's like very intentionally we did not implement this in the demo app. So there was this like somebody opened an issue and was like why is this so much slower than Redis? It's written in Rust. And Rust is supposed to be fast.

Speaker 4:

Well, it's a demo. You know, like, we intentionally did not draw the rest of the owl. And so the comments from all of the maintainers were, well, you know, we don't implement pipelining. So if you do, like, one Redis command,

Speaker 1:

then

Speaker 4:

you wait for that command to come back, and you can't send any more. And so that's sort of like, well, it's not supposed to be a Redis replacement; it's supposed to be a demo, you know? And the user just, like, keeps asking all of these questions, and is like, I'm running all of these benchmarks, and it's like, well, why are you doing that, for one thing? But then somebody replies, and I think this is a third person, who is neither the, like, original question asker nor one of the Tokio maintainers who are, like, trying to explain to this person that this is, like, you know, intended purely for educational purposes, and we specifically did not do the complicated thing that actually makes it suitable for production use. And this third person shows up and is like, well, so I turned off Nagle's algorithm, and it's actually almost as fast as Redis.

Speaker 1:

That's great. Wow.

Speaker 4:

And so, but I think that this is actually a case of, like, benchmarks lie. Right? Because it was, like, almost as fast as Redis because he was doing this benchmark entirely on his local machine.

Speaker 2:

Totally.

Speaker 5:

Because he's pipelining in the buffer.

Speaker 4:

Yeah. He's pipelining in the Linux kernel. Right? Because he's just churning connections on local host. And so now it's, like, as fast as Redis in this benchmark.

Speaker 4:

But you only get that when you take the parking brake off. Anyway, that's the end of the story.

Speaker 2:

And I have to say, actually, it should be said that, like, once I had these outliers reproducible, getting it all the way debugged was a day. It's a long day. And relatively early on, Adam, I don't know if you saw, but our colleague Robert Mustacchi is like, have you checked Nagle's? And I'm like, you know, this could be something like Nagle's, but I just continued to debug it from first principles.

Speaker 2:

In part because, like, I have found the kind of folk wisdom around performance debugging to be so frequently wrong. To Dan's kinda earlier point: we had this kind of theoretical construction of the network, and the data that we saw out of it was so wildly different. So I think I have probably, like, overreacted, at least, on some of these things. But Nagle's is so brutal to debug that I really do need to go and check for it earlier, as opposed to insisting on getting the data to prove beyond a shadow of a doubt that it is Nagle's.

Speaker 1:

I

Speaker 5:

I definitely sympathize with that internal fight, like, of trying not to just reach for the esoterica that I've learned in the past. This whole thing made me realize, like... I went and looked, and so ss in Linux is actually fantastic if you turn all of the option logging on. You can go and see a lot. But I just realized, because I just went and looked for it, it doesn't have Nagle in the list of options that it interprets and prints out, which is mind-blowing. Oh my god.

Speaker 5:

It's got all the other congestion control stuff in there, but it doesn't have Nagle in there, which is just crazy.

Speaker 2:

It is crazy. And, you know, Ryan dropped this into the chat earlier, and, you know, I talked about this with Robert as well on Friday. It's like, you know, we do not bump a counter when this happens. Like, there's no MIB for when you've hit this. We have MIBs on things like drops.

Speaker 2:

Right? And we've got, like, a somewhat standardized way of looking. And we kind of know that if I'm looking at a networking problem, I do wanna make sure that there are no drops. But, boy, you really wanna treat this like a drop. I mean, it's not one, but you really wanna have a glaring kind of red alarm when you've actually done this.

Speaker 2:

So we've definitely got some improvements to make to the system, for sure, to make this much more debuggable when and where it happens, because it is just absolutely brutal. And I get it. I'm curious if other people have debugged it from first principles, and if so, what did you use? Because it is really, really hard to debug, because it is so transient. And it doesn't happen on every packet, either.

Speaker 2:

I mean, even for us, like, it doesn't happen on every workload. Because if you have a workload in which it's kind of naturally synchronizing around a flush, that is to say there's no other activity going on when it does flush, that flush will happen promptly, because there's no unacknowledged data outstanding, nothing waiting on an ACK.

Speaker 4:

I mean, I kinda had the thought that maybe, like, somebody should just open a PR to Tokio that changes the, like, fmt::Debug implementation for TCP streams so that, like, you know, when you print it, maybe just if you use the alt mode, the, like, multi-line format, maybe not, I don't know, there should just be a bool in there that's, like, did you call set_nodelay on this?
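
A sketch of that idea: a hypothetical wrapper, not Tokio's actual Debug output today, that surfaces the nodelay getter std already exposes:

```rust
use std::fmt;
use std::net::TcpStream;

// Hypothetical wrapper; not what Tokio's Debug impl prints today.
struct DebuggableStream(TcpStream);

impl fmt::Debug for DebuggableStream {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("TcpStream")
            // std exposes the getter, so a Debug impl could surface it.
            .field("nodelay", &self.0.nodelay().unwrap_or(false))
            .finish()
    }
}

fn main() -> std::io::Result<()> {
    let s = DebuggableStream(TcpStream::connect("127.0.0.1:9000")?); // hypothetical
    println!("{s:#?}"); // alt (multi-line) mode, per the suggestion
    Ok(())
}
```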

Speaker 2:

Well, yeah, I like that idea a lot. And I think, somehow, I feel if we didn't call it TCP no delay, but instead called it TCP delay... I mean, somehow, if you saw TCP delay set to true, you'd be like, wait

Speaker 1:

a minute. What is that?

Speaker 4:

Yeah. That that that's the button that makes the computer slow.

Speaker 1:

Right. Exactly.

Speaker 2:

That's right. It really is. It's really the turbo

Speaker 4:

button thing where the turbo button actually makes the computer go slower.

Speaker 2:

Wait, you said ss. What is that? Is that a program that dumps that out?

Speaker 5:

Yeah. It's part of iproute2. It's, like, socket statistics. But, yeah, it's got, like, info flags and extended socket option flags, which dump most of the stuff you wanna know, just weirdly not this particular flag, which is wild. But, yeah, we use it all the time for, like, ECN and everything else, and, like, you can see sockets in, like, exponential backoff.

Speaker 5:

So, like, talking of timers that have not been reconsidered for a very long time, TCP backoff is still over an hour. Right? If you disconnect one end of a TCP socket and you don't have, like, you know, a healthy router, or you've got overly firewalled routers in the middle, it's gonna sit there and keep trying for over an

Speaker 4:

hour. I think the TIME_WAIT timeout is also just, like, kind of ridiculously long. I don't remember what it is, though.

Speaker 5:

Yeah. Speaking of people burning themselves, that's, like, 20 seconds or 40 seconds, something like that. It's, like, long enough that web servers get burned by it.

Speaker 4:

Yeah. It's long enough that if you, like, start a program that terminated very abruptly, and you try to restart it and it tries to bind the same socket, then that program also crashes. And it really should not be long enough that if I, like, type the command again, I, you know, can't open my socket again.
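
The usual escape hatch for that restart-and-bind failure is SO_REUSEADDR, which lets you rebind while the old socket lingers in TIME_WAIT. A sketch, assuming the third-party socket2 crate (std's TcpListener does not expose the option directly) and a hypothetical port:

```rust
use socket2::{Domain, Socket, Type};
use std::net::SocketAddr;

fn main() -> std::io::Result<()> {
    let addr: SocketAddr = "0.0.0.0:8080".parse().unwrap(); // hypothetical port
    let socket = Socket::new(Domain::IPV4, Type::STREAM, None)?;
    socket.set_reuse_address(true)?; // rebind even if the old socket is in TIME_WAIT
    socket.bind(&addr.into())?;
    socket.listen(128)?;
    Ok(())
}
```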

Speaker 1:

Yeah. But then by the time you go to debug it, it works. It couldn't be more infuriating.

Speaker 2:

The, do you remember the AK test issue, Adam?

Speaker 1:

No. Is this the test framework we had for the ZFS storage appliance?

Speaker 2:

And our colleague, then and now, Dave Pacheco, debugged the connection refused that we would see occasionally. When you ran the test suite, everyone saw it: you'd just start seeing some connection refused. I'm like, what is that? But then it would go away. It took absolutely forever, and that's ultimately what it turned out to be.

Speaker 2:

So you had connections in TIME_WAIT. So, Bryan, a question

Speaker 1:

for you on debugging this. Because I was thinking, like, wouldn't it be cool if we had a kstat or a DTrace probe or a MIB or anything to scream for help when this happened? Would you have heard that scream? Like, did you check any of these places? Like, would you have seen something that was obviously going bananas?

Speaker 2:

So here's what I would do. Yes, I would have. Here would be my proposal: adding a kernel statistic and putting drop in the name. I may be, like, overfitting for myself here, but I did look for drops. Because, once again, alright:

Speaker 2:

this only exists in the network. And, you know, we built our own network. We built our own switch. Like, the mind runs wild with things that it could possibly be, and eliminating dropped packets as a source is really important for us to do early. And, you know, knock on wood, but we haven't had issues with dropped packets.

Speaker 2:

So I did look for dropped packets. And you almost wanna call this, like, you know, like, a Nagle resend drop or something, which is a total misnomer, but

Speaker 3:

you but you know what

Speaker 2:

I mean? As a way of indicating. Like, this should be named something that is scary in some way.

Speaker 3:

You could

Speaker 4:

call it, you know, drop, not actually dropped, but delayed. You know what I mean?

Speaker 1:

Yeah. Right. JK dropped. Right.

Speaker 2:

JK dropped. Did you see what Ian dropped in the chat, by the way? A Lionel Hutz rendition of TCP. They wait. TCP?

Speaker 2:

No. Delay. They got this all wrong.

Speaker 5:

So I've been trying to think what this looks like in perf, because I use perf a lot on Linux these days, and, you know, the flame graph output and what have you is kinda useful. But having spent enough time on Fuchsia, where we had whole-system traces (they weren't quite precise traces, but they were close enough) and you get the linear Chrome-timeline kind of view on things rather than the overlaid summary that is a flame graph, and that was really useful, particularly in early system builds, for just, like, what actually just happened. I mean, you can do it in text, obviously, as well, but it's also kind of useful at these data scales just doing the whole thing in a UI. And I find, like, I really want a good one of these for Linux, and so far I have not come anywhere close to getting something that is actually decent. But, like, you could see Nagle there, and, like, I know what that stack would look like, but trying to pick that out of a flame graph would be really hard.

Speaker 2:

You are for sure not gonna pick it out of a flame graph. I do think that you need to kind of visualize every packet in terms of its latency, and you need to be able to see those outliers. Because, like, what you are 100% not gonna see is the actual code path that does this. Right? That thing is quick.

Speaker 2:

It's gonna be hard to catch. And I think this is where you wanna be able to instrument the entire system and know that you've done it in such a way that you've got all of the data from the system, so now you can actually go reason about all of it. I also do feel like, yeah, maybe we should be, like, assigning timeouts, like, the way we assign, like, you know, AS numbers or whatever.

Speaker 2:

You know what I mean? Like, we should assign timeouts. Everyone gets a unique timeout value, and it must be relatively prime, maybe a prime number of milliseconds.

Speaker 1:

And then there's some gnuplot feature that lists them all, like gravitational constants or whatever, so that you can see if you're some multiple of some known esoteric prime integer. That's exactly right.

Speaker 2:

So that's

Speaker 4:

like now you have to, like, deal with depletion of the, like, strategic prime number reserve for timeouts. Like, we're out of timeouts. We can't

Speaker 2:

We're out of timeouts now. We're out of timeouts. I'm sorry. I think that would be a great thing, actually: you can go, like, buy someone else's timeout. You can go, like, you know, maybe there's some dead protocol.

Speaker 2:

Go take them over or something. But, like, yeah. Sorry. We're out of time outs.

Speaker 6:

Time outs. Sorry. The only time out we got left is 37 minutes. You can take it or leave it. Exactly.

Speaker 2:

37 or 37 microseconds. I have 37 microseconds if you want that one. You can take that one too.

Speaker 5:

Add a little color to it, and then you get another round. And then you can add a sound to it and you get another round. Eventually, you just run out of senses to add to the thing.

Speaker 2:

That's right.

Speaker 5:

The whole visualization thing still feels like a big thing to me. You were talking about this earlier on: just being able to see everything that's going on and having lights and stuff. And I love going to the Computer History Museum and just seeing all the, like, old mainframe boxes, and they're just, like, covered in lights. And I'm just like, yes, I know the kind of intuition that you build up for a system seeing something like that.

Speaker 5:

And you walk into a place, and you're like, I've seen interrupt rates like that before. That's broken. And we just don't have that anymore, because we don't have that kind of pacing, and there's too much stuff going on too fast, so it's kinda hard to see. But even in that world where there's a lot of stuff that's too hard to see, I think there is actually still a universe where even just strobing patterns and so on are things we're actually pretty good at; we just don't expose them all that well.

Speaker 2:

Totally. Is can I ask a dumb question? How do I, like can I post an image in Discord? How do I do that? I'm trying to, like, not pop

Speaker 1:

Do you wanna drop do you wanna post the one you're Yeah. The yeah. Yeah.

Speaker 2:

The the image this is yeah. Adam, could you act as my proxy on earth? And Yeah. Thank you. How did

Speaker 1:

you do how did you do that? I'll show you later, grandpa.

Speaker 2:

Try trying to get me to bed? Yeah.

Speaker 1:

Alright. Fair enough. Well, I gotta

Speaker 2:

I gotta, yeah. I guess that's not given away; that's gotta be earned. So this is just the visual of it: once we actually knew what this was, I had the, kind of, like, 3 machines that I needed to go to, because this disk volume exists on 3 different machines in the rack, and I'd located the actual sockets on these 3 machines. I'm like, alright.

Speaker 2:

Now I'm gonna go whack the actual value in the kernel. Do not do this at home, kids. A little mdb -kw action.

Speaker 1:

Wait. Wait. Hold on. Hold on. Hold on.

Speaker 1:

You you're, like, at 40 milliseconds. What about 0 milliseconds? That that that was your plan?

Speaker 2:

No. My plan, yeah. Because the Nagle limit is expressed as a quantity of bytes.

Speaker 2:

And, like, gotcha: what constitutes what they call a runt packet? And how about a runt packet is 1 byte? Which is off, basically. That's fine. And so then I did that on all three of these machines.

Speaker 2:

And then the graphic that this young whippersnapper here was able to drop in the Discord, that I was unable to do for myself because I don't know how to live in modern society, shows the actual latency, and you can see that super clear 100-millisecond banding. And then also you can see why it was hard to debug, because there's a lot that are finishing in less than 100 milliseconds, and then just all those go away. And, I

Speaker 1:

mean, the

Speaker 2:

aggregate latency. I mean, interestingly, like, the mode latency for sure went up, because we started doing a lot more work, because we were uncorked. You know? And this is where I do think looking at the average is deceptive, the mean is deceptive, looking at the mode is deceptive. I mean, you really do need to visualize the entire distribution, because what you're seeing there is not gonna be something that a number is gonna really convey. And, indeed, the numbers that you might get out of it would be very misleading.
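
A toy illustration of why a single summary number lies here: a bimodal population with a fast path plus a Nagle-style band produces a mean that no actual request ever experienced. The numbers are made up:

```rust
fn main() {
    // 900 fast-path requests at 0.8 ms, 100 banded requests at 100 ms.
    let mut samples_us: Vec<u64> = Vec::new();
    samples_us.extend(std::iter::repeat(800).take(900));
    samples_us.extend(std::iter::repeat(100_000).take(100));

    samples_us.sort_unstable();
    let mean = samples_us.iter().sum::<u64>() / samples_us.len() as u64;
    let p50 = samples_us[samples_us.len() / 2];
    let p99 = samples_us[samples_us.len() * 99 / 100];
    // mean is ~10,720us: a latency that not one request actually had.
    println!("mean {mean}us  p50 {p50}us  p99 {p99}us");
}
```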

Speaker 5:

Yeah. For sure. This is my pet peeve with, like, operational monitoring for higher-level systems at the moment: like, stuff getting fed through Prometheus into Grafana is just, like, full of lies all the time. It drives me nuts. But seeing the graph reminded me of something which is a positive thing out of the higher-level universe at the moment: we've been using a product called Honeycomb, and Honeycomb has this amazing feature in it, they call the feature BubbleUp, where you can go into a dataset, and you select over a range in the dataset, and you say, tell me things which are unusual about this range compared to the global average.

Speaker 5:

Then they will just surface the, like, 10 or 20 different metrics, and you can instrument tons of stuff. This is all, like, OpenTracing-style stuff, so it's also the high-level traces. But they do this data analysis over all your data for you. And so you can just grab a range, say tell me what's weird about this range, and it will bubble stuff up. And that feature is a feature I'm very willing to pay a lot for that product for.

Speaker 2:

Yeah. That is really interesting. And, I mean, I know they have a lot of really interesting data visualizations. We can just get Charity on here to talk about some of the stuff they've done. For folks who didn't see in the chat, they're asking if a violin plot would do better.

Speaker 2:

That is a reference to Angela Collier. Have you watched any of her videos, Adam?

Speaker 1:

I I I don't get the reference at all. No.

Speaker 2:

Oh my god. I gotta plug her. So she's an astrophysicist by training, PhD in astrophysics, and she just goes uncorked on these various rants. And she's got a rant about violin plots, describing that violin plots should actually not exist. It's really great. So I'm gonna drop that into the chat. And if I'm giving you the gift of Angela Collier, you're gonna really appreciate it.

Speaker 2:

She's really extraordinary. So Viking, excellent use, excellent crossing of the streams in there.

Speaker 1:

Fantastic.

Speaker 2:

Well, it's been good. It's been very therapeutic,

Speaker 1:

I think, for me. Good. I know you needed this.

Speaker 2:

I did need this. I did need this. And, as we know... I can see you looking at your watch there. I can see my time

Speaker 1:

is up.

Speaker 2:

And he's, like, beginning to try to take back the box of Kleenex from me. Me maybe resisting a little bit, wanting to hold on a little bit longer, to wipe one more tear.

Speaker 1:

But this is a great service too. I don't know. I mean, I agree with everything that's been said about the oral tradition of debugging being misleading. Yes. And that you don't want just, you know, one person who saw it one way.

Speaker 1:

So let's just try to, you know, prescribe, you know, aspirin and see if it clears up.

Speaker 2:

On the other hand, can we just say, just as a quick aside, that the ne plus ultra of that is the JVM garbage collection algorithms?

Speaker 1:

Yes. Truly handed down father to son, mother to daughter, over the generations.

Speaker 2:

And I'm trying to figure out why I have such an allergic reaction to that. If you've done anything with the JVM, I feel you've heard, like, oh, what GC algorithm are you using? You should, like, use a different one. The one you're using is, like... yeah. I'm sorry, Eliza. I'm sure there are some people

Speaker 1:

who are like, I actually sorry.

Speaker 2:

This is gonna be, like, we need a new therapy session now on JVM GC. There's a bash

Speaker 4:

script that's like, you know, the I-want-to-run-a-Java-program bash script, and it's kind of like the 10 lines of, like, -XX no-whatever, -XX mark-sweep, -XX concurrent-whatever GC, this much metaspace size, so on and so forth. So that if you want to invoke, like, any Java binary, you kind of have to, like, use this bash script of, like, make the JVM be normal?

Speaker 2:

Exactly. It does. It feels like a book of spells that's been handed down. So sorry for that.

Speaker 5:

I feel like JVM flags need a content warning.

Speaker 2:

I I

Speaker 1:

I know.

Speaker 3:

No. No.

Speaker 1:

I am really sorry about that.

Speaker 2:

But, Adam, you've been through that as much as anybody.

Speaker 1:

Oh, yeah. For sure. But it was sort of like magic that was, like, handed down. Like, you know, smuggled out of Twitter, like, wrapped in newspaper print.

Speaker 2:

Totally.

Speaker 1:

You know? And then and then unveiled outside of the building.

Speaker 2:

I just have the pulp. I've got Christopher Walken from Pulp Fiction.

Speaker 1:

These uncomfortable GC flags, carried through the prison camp. Exactly.

Speaker 6:

In a former life, I was at VMware, and one day I looked around and realized that everybody I worked with used to work at Sun, and they used to work on garbage collectors, and they used to work on different garbage collectors.

Speaker 1:

Oh god.

Speaker 6:

And they

Speaker 4:

I never thought

Speaker 6:

of pitting them against each other to get their their secret sets of JVM flags, but they all would have had totally different ones.

Speaker 2:

So not only would they have had totally different ones, they may well have been on different sides of one of the great civil wars. Adam, you should write a book on technology's civil wars; I feel I would absolutely read it. Or a Ken Burns 14-parter, reading out the "my dear spouse" letters.

Speaker 1:

Yeah. I'm I'm in.

Speaker 2:

Oh my god. Ken Burns, organizational civil war. I I just god. I love it. But so they, it then they may have been on hot spot versus exact, which was a very hot civil war inside of Sun between Labs and Sunsoft, and it left, like, a the the that one burns hot enough that you can mention it to a belligerence today and still, like, they they will still tell you that everyone else is, like, full of it.

Speaker 2:

So sorry. I don't know. You were trying to bring us home, and I keep,

Speaker 1:

like Yes. Yes. I've seen our time is up. Right?

Speaker 2:

That's right. And we mentioned it again. But, yeah. You get all these shibboleths that we resist because of these stories, like the JVM flags, the GC flags, and so on.

Speaker 1:

That's right. But shibboleths are useful as, like, tools for your toolbox, not to just blithely misapply. Knowing that, hey, if you've got some bizarre outliers, like, there are some things to explore.

Speaker 2:

And you probably do wanna check for TCP no delay. You do wanna check for Nagle sooner rather than later. One's desire to debug from first principles is great, but, also, I would say once you believe it might be Nagle, it's very easy, or relatively easy, to demonstrate. So that's another reason to investigate that one early. Eliza, if you could expand your Tokio list of things to check, please.

Speaker 2:

I wanna go check, because I feel that, you know, there are other things like this that are lurking. You definitely wanna have those at the ready. And then you can make a good argument, like, hey, this should not be the default. And, Raggi, I think you made a very interesting argument about the timeout being different, and I think we should either lower it or change it to be a prime value. I'm dying on that

Speaker 1:

hill. Agreed.

Speaker 5:

Exactly. I had one other idea on this, which could probably be done quite easily: we can make a Wireshark warning filter for it fairly easily.

Speaker 2:

That's a good idea. I like that. I actually like the Wireshark warning; I think that's a really good idea, where you can see that Nagling is happening. And I like, Eliza, your idea of printing out TCP no delay in the, like, kind of pretty formatting of a Debug on a socket.

Speaker 2:

And Ryan in the chat had good ideas about, like, adding a kstat, adding an SDT probe. There are things we can do to make this more visible. I think we are not ultimately gonna make this problem go away, and I probably have not debugged this for the last time in my life, unfortunately, but I'd like to believe that the next time will be a little less painful. And this is great to have everyone. Man, I'm telling you, TCP no delay. Search for that on social media.

Speaker 2:

It's the tribe. You just identify the tribe immediately. So, alright. Thanks, everybody. Take care.

Speaker 2:

See you next time.