The Rust Workshop Podcast

An update on what we've learned from spending more time with Rust doing real coding. The perspective from C#, C++, Ruby, etc.

- The Rust Workshop: https://rustworkshop.co/ - a consultancy and practice of interest
- Jim (aka Luxagen) / RotKraken: https://github.com/luxagen/RotKraken
- Tim / Gitopolis: https://github.com/timabell/gitopolis
- Non-lexical lifetimes, problem case #3: https://rust-lang.github.io/rfcs/2094-nll.html
- GPT ideas for dependency injection: https://gist.github.com/timabell/8813d851399908987396c1725aa8b6d6
- The TOML upgrade patch: https://github.com/timabell/gitopolis/commit/8fb0a4336d3431c11d2d61baa693a35ddd090365


What is The Rust Workshop Podcast?

All things Rust. Discoveries, Learnings, Interviews
This show is brought to you by The Rust Workshop https://rustworkshop.co/ for all your Rust coding needs.

00:23
Hello and welcome to another episode of the Rust Workshop Podcast. This is episode two brought to you from slightly cloudy Tadley. Joined again by Jim, who was with us in the last show, so I won't do a detailed re-intro, but I'm a dev of many years. Jim is a dev of many years. We're going to talk about where we got to on our journey with Rust, because it's been a little while. We've all done a bit more hacking and learning.

00:54
Um, Jim, do you want to give us a kickoff on what's new with you and Rust? Because it seems like there's even more on your side than mine. Yeah. Um, you got the drop on me with, uh, spotting the Rust opportunity and the, the, uh, good time it is to be alive with respect to programming language design evolution, especially with respect to Rust. But, uh, as is often the way I've belatedly, um,

01:23
made a thrust at it and done some pretty intensive work over the last couple of months, in between other things. So, what I've been up to, after following your progress initially: my GitHub account basically contains mostly projects related to

01:52
data integrity components and programs that factor into my overall long-term data integrity strategy, which in the last couple of years has gone up a notch, quite a notch, largely due to not only having the idea for, but implementing, RotKraken, which I'll get to. And so what I was

02:23
trying to do towards the start of this year was to develop an ancillary tool for RotKraken, and as I got used to doing in the last couple of years, I prototyped it in Perl because there is... Let's just back up a bit there. So RotKraken, as I understand it, is about checking the integrity of whatever you've kept on disk. You and I are on the same page with this sort of stuff, in that

02:52
One of the biggest challenges for someone who has digital stuff for any length of time is what we call bit rot, which is any reason that you put a photo on disk or a music file on disk or something, and then you go to find it a couple of years later and discover that the file name is still there, but there's nothing in it or it's just the one where you accidentally... Or the photo is made of macro blocks. Yeah. Or you've got the thumbnail... Worse, you open up in Explorer, you can see the thumbnail, but that's all that's left and the actual full quality data is gone. So we've developed a couple of different approaches for this.

03:22
And my understanding, I've got to tell you what I think it is, and then you can correct me, is that RotKraken's approach is to hash every file in a tree and store the hash in the extended metadata on the file system for each file, which means if you move a file, the metadata stays with it, which is good for reorganizing. And then you can parse the tree and compare trees and what have you. Yeah, yeah, yeah. RotKraken is some tooling around that, so that creates and checks those attributes.

03:49
and has some capabilities around comparing trees and that kind of thing. This is where I'm getting to, so let me put a pin in the RotKraken story for the moment. I'll tell a brief summary of how that came to be shortly. But earlier this year I was developing an ancillary tool for that, which is what you referred to about comparing differences in different versions of the same tree. And I started doing that in Perl.
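For illustration, here is a minimal sketch of the extended-attribute idea described above. This is an editor's example rather than RotKraken's actual code: it assumes the `xattr` and `md5` crates, and the key name `user.demo.md5` is made up.

```rust
// Editor's sketch of the xattr idea, not RotKraken's real implementation.
// Assumes the `xattr` and `md5` crates; the key name "user.demo.md5" is hypothetical.
use std::{fs, io, path::Path};

const KEY: &str = "user.demo.md5";

fn record_hash(path: &Path) -> io::Result<()> {
    let bytes = fs::read(path)?;
    let digest = format!("{:x}", md5::compute(&bytes));
    // Store the hex digest as an extended attribute on the file itself,
    // so it travels with the file when it is moved or renamed.
    xattr::set(path, KEY, digest.as_bytes())
}

fn verify_hash(path: &Path) -> io::Result<bool> {
    let bytes = fs::read(path)?;
    let digest = format!("{:x}", md5::compute(&bytes));
    // Compare the current contents against the previously stored digest.
    Ok(xattr::get(path, KEY)?
        .map_or(false, |stored| stored.as_slice() == digest.as_bytes()))
}

fn main() -> io::Result<()> {
    let path = Path::new("photo.jpg"); // hypothetical file
    record_hash(path)?;
    println!("intact: {}", verify_hash(path)?);
    Ok(())
}
```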

04:19
And then shortly after that, I realized that for, I was testing it on my entire server file system. And it was, this tool was written in Perl for comparing versions of a tree was taking quite a long time. And just partly because algorithmically what it was doing was reasonably complex. So,

04:46
I took that as an opportunity and I thought, well, maybe I'll try prototyping this thing in Rust. So I did that. Uh, and that was my first Rust project in anger, so to speak. And the performance advantages spoke for themselves. Uh, it's the kind of thing that based on in the ideal case based on, uh, log files that already existed.

05:14
and that were in the disk cache, which is generally somewhere around 100 megabytes for a whole file system log, it could do the job in, I forget the exact number, but something like parsing each log in seven seconds and taking an insignificant further time to do the comparison. So Rust proved its worth in kind of stopping that Perl project part way.

05:41
and moving to Rust, replicating what I had and then moving on with it. After that, I got it into my head, the idea of evaluating the performance advantage for RotKraken itself. I didn't have a serious intention to rewrite RotKraken in Rust, but I thought that as I noticed the Perl version was taking about four minutes to go through the whole file system, of which at least two minutes.

06:10
was just directory traversal, just actually going through the file system. I thought, well, maybe there's an easy win there. So what I've been doing recently is doing a prototype in Rust of the traversal, not worrying about the extended attributes and the rest of the functionality at first. And what ended up happening is that I've actually completed the rewrite and

06:38
The Rust version of RotKraken is more or less complete now. Couple of small bugs to iron out. And although, sadly, it's only twice as fast as the Perl original, with all of the features in, that's still something. And what I found... Only twice. That's good. Sorry? Only twice, I mean, that's really good, isn't it? Yeah, I mean, it shows how blasé I've become about Rust's capabilities and...

07:05
how used I've got to the idea that Rust is a totally viable C++ replacement, that I can complain about it only being twice as fast. But yeah, that's done now. And what was interesting is that it was one heck of a learning experience doing that on multiple fronts. And I'm thinking in the high level details, that's kind of what we can get into today.

07:32
So yeah, so I can move on to a quick précis of the RotKraken story, if you'd like.

07:42
Maybe we'll come back to that in a bit. Let's dig a bit more into the Rust side of things for a bit. Maybe we'll look back to the, more of the RotKraken story in a bit. Maybe we'll jump into a bit of like, so that really kind of summarizes what you've been up to with Rust since we last went on air, hasn't it? Yeah, okay. All right, flip to my side for a bit. So I've had a convenient gap.

08:10
after the last contract finished, which is always a pleasure of contracting in my mind. It'll be a bit stressful in terms of like, what's coming next? Who knows? But I've been taking advantage of that. Not a lifestyle for the risk averse, is it? No, it really isn't. Or the anxious. But I've been taking advantage of that to kind of move forward my ideas for this Rust workshop. And the current push is to...

08:39
generate some inbound interest by looking around at what presentations I could do, getting in front of economic buyers as they're known. So having finished Gitopolis as a thing that works, one of the things that jumped out at me from what I've learnt is the use of dependency injection because it took me a couple of attempts to get that where I wanted it to be and I'd had some previous conversations around that and I thought that might be an interesting subject for a talk.

09:10
So I've obviously learnt a brand new presentation system, which is reveal.js, which is really cool. That allows me to build presentations purely in Markdown with a sprinkling of HTML as needed. So I've got a template together for that. That's cool. I can reuse that going forward. Definitely recommend. And then from there...

09:37
actually reflecting on what did I learn about DI? And the other thing that's big in the podcasting sphere and everywhere else at the moment is ChatGPT and AI and how that affects everything. And the fact that you can write code with GitHub Copilot and some of the plugins that interface with the ChatGPT APIs, I've mostly avoided that.

10:08
been a bit more on the cynical side of, well, you know, code is just a specification, so if you're correctly specifying it to ChatGPT you might as well have written the code, which I still somewhat believe. But it's been interesting to get a bit more stuck in. I had a demo from someone of actually how they're using it to sort of thrash out some ideas. This is someone from the Megamaker community, which is Justin Jackson's crowd.

10:37
So I've had a play with using ChatGPT to generate Rust. That was an interesting exercise. Using the Genie... Can I just interject for a moment by the way? That thing you said about being on the cynical side and talking about specifications, I think you're right. And I'm starting to think that the way to think of ChatGPT and its uses for code is just as a velocity multiplier, literally as a typing aid.

11:07
basically and if you learn to use it well you have to do much less typing depending on how fast a typist you are of course affects how big an advantage that is. It's a reading stack overflow aid to some extent it can it can comprehend all of the documentation all of the stack overflows and synthesize that into like well here's some plain English and some code samples that may or may not work depending on how creative it's feeling.

11:36
How stochastic perhaps? The danger of stochastic parrots. Yeah, I'd say there's more sorts of interesting things like something that I've just heard a couple of things about that I don't know, but sounds kind of highly entertaining is people jailbreaking GPT. So figuring out how to like inject stuff into it and make it do things that it's not supposed to do. That sounds highly entertaining, but more entertainment than like.

12:05
What can we do with it for coding? As in getting it to cuss? That kind of thing, yeah. So yeah, I used it to generate just a simple CLI app in, what did I get it to do? Oh yeah, I got it to, I asked it to generate md5deep, which is along the same lines as what we've been up to. So I personally have been using md5deep myself.

12:35
to hash my tree of files that I care about. And then I've been keeping a copy of the text file that it produced and just using its audit capability to see whether I've lost anything. Just basically grepping for the magical known file not used, which is not super easy to deal with, but it's good enough for me for now. So I thought, oh, it'd be kind of an interesting challenge. Like, will it generate me a Rust CLI, a Rust?

13:05
command line tool that will just repeat what md5deep does. And the answer was no. In C# and in Golang, it managed to generate a command line program that did compile and did run and did hash all the files in the current directory, which is not bad, produced good outputs. That was cool. No recursion.

13:34
but it didn't implement recursion. And I didn't manage to persuade it to write tests for it successfully either. In Rust, which the guess is that there's just less input data around for the big AI model, it didn't manage to make anything that compiled or it could be that just Rust is hard and it finds it difficult just like we do. So that was an interesting exercise. And the other thing that I used a chat GPT for, which was kind of mind blowing on the face of it.

14:04
this is partly why I got into it, was for this talk of dependency injection in Rust. I thought, well, I need to enumerate all of the ways that we could do the dependency injection. So I thought, oh, well, why don't I just ask GPT for it? And sure enough, it spat out some very convincing plain text descriptions of ways of doing it complete with believable code samples, and I just basically kept asking it more like, well...

14:34
Are there any other ways? Can you enumerate all the ways? And it spat out more and more. And it pretty much covered everything that I was aware of in terms of how to do the dependency injection that had taken me quite a while to figure out. I was pretty impressed, frankly. And I'm actually going to use, you know, I copied that up into a gist, which I'll probably link in the show notes. So you can see the whole conversation. This was done through the VS Code Genie GPT plugin. And I will now...

15:03
do the hard work of turning those into actual working Rust code in a real project, which I'm in the middle of at the moment, to see, you know, does what it says actually work, are there any gaps, and sort of prove these various approaches. And then I can go to the talk and say, these are ways that definitely work for doing dependency injection in Rust, which is cool. So yeah, that was very interesting.
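For context, a minimal sketch of the most common of those approaches (an editor's example, not lifted from the gist): constructor injection via a trait, which is what makes swapping the real dependency for a test double straightforward. The names here are all made up.

```rust
// Editor's sketch of trait-based constructor injection in Rust; names are hypothetical.
trait StarFetcher {
    fn stars(&self, repo: &str) -> u32;
}

struct GithubFetcher; // the "real" dependency
impl StarFetcher for GithubFetcher {
    fn stars(&self, _repo: &str) -> u32 {
        42 // placeholder; a real implementation would make an HTTP call
    }
}

// The consumer depends only on the trait, injected through the constructor.
struct Report<F: StarFetcher> {
    fetcher: F,
}

impl<F: StarFetcher> Report<F> {
    fn new(fetcher: F) -> Self {
        Self { fetcher }
    }
    fn summary(&self, repo: &str) -> String {
        format!("{} has {} stars", repo, self.fetcher.stars(repo))
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    struct FakeFetcher; // test double injected in place of the real thing
    impl StarFetcher for FakeFetcher {
        fn stars(&self, _repo: &str) -> u32 {
            7
        }
    }

    #[test]
    fn summary_uses_injected_fetcher() {
        let report = Report::new(FakeFetcher);
        assert_eq!(
            report.summary("timabell/gitopolis"),
            "timabell/gitopolis has 7 stars"
        );
    }
}

fn main() {
    let report = Report::new(GithubFetcher);
    println!("{}", report.summary("timabell/gitopolis"));
}
```

The same idea can also be done with `Box<dyn StarFetcher>` instead of a generic parameter, which trades a little runtime cost for simpler signatures; both show up among the approaches discussed.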

15:33
But on the flip side, having tried to use GPT a bit naively perhaps, it hasn't really changed my view, like you say, that it's a bit of an accelerator perhaps in places. But the idea that this thing is going to do anything like what we do in contract coding, maintaining gnarly projects or getting a greenfield thing off from scratch... it's less exciting than Ruby on Rails, frankly, because the speed boost of building a new web app

16:00
like a CRUD web app in Ruby on Rails versus Rust or Golang or C# is huge. And you can maintain that thing up to a decent extent for the long term, whereas it seems that the GPT one, it might create, you know, your initial scaffold to some extent, and then you might be able to nudge it to iterate on that design. I've seen some demos of people feeding back into it, sort of saying, you are an expert .NET programmer and

16:29
you've been asked to create this. And then it spits out its answer and it says, oh, but you forgot to do such and such. And then it comes back again with a better version. So I haven't really explored that side of it. So on the, what have I done since last time we spoke, thing: working on the presentation, playing around with ChatGPT, it's published on GitHub already. I'm in the middle of this demo repo. I was originally going to

16:56
extend the GitHub listing to try out the different dependency injection approaches. But the problem with that is, particularly because Rust is so thorough, if I was going to try a different dependency injection approach I have to make everything work, and that's a lot of work. Whereas just with a trivial demo project I can do the simplest thing. I've started making a web call to the GitHub API to just get the number of stars, which is our little thing that we can then mock out and

17:25
see what it does. That has introduced async, which is new to me. Uh-oh. Uh-oh, indeed. So we'll see how that goes. I have managed to factor out an async method so far, and it still works, like pulling an async function out of the middle of the main thing. And I did use ChatGPT to get it to generate the initial API call, and it worked. And that was handy.
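As an illustration of the kind of call being described, here's a hedged sketch from the editor, assuming the `tokio`, `reqwest` (with its `json` feature) and `serde_json` crates; these are not necessarily the libraries the generated code actually used.

```rust
// Editor's sketch of an async GitHub stars call; the crate choices are assumptions.
use std::error::Error;

async fn fetch_stars(owner: &str, repo: &str) -> Result<u64, Box<dyn Error>> {
    let url = format!("https://api.github.com/repos/{owner}/{repo}");
    let json: serde_json::Value = reqwest::Client::new()
        .get(&url)
        .header("User-Agent", "stars-demo") // GitHub's API requires a User-Agent header
        .send()
        .await?
        .error_for_status()?
        .json()
        .await?;
    // stargazers_count is part of the repository response payload.
    Ok(json["stargazers_count"].as_u64().unwrap_or(0))
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
    let stars = fetch_stars("timabell", "gitopolis").await?;
    println!("{stars} stars");
    Ok(())
}
```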

17:54
It took a few minutes to get something that looked roughly right, that introduced me to some libraries I didn't know about and some approaches I didn't know about, and then it took me probably another half an hour to make it actually work and build. So it didn't get it right out of the bag, but it did give me a bit of a head start. By the way, something occurred to me further on the whole ChatGPT-as-a-typing-aid thing.

18:23
I think in the medium term, I predict that it might synergize quite well with Rust, and in particular, and the reason for that is that when you have something that can comprehend stack overflow information and act as an intelligent typing aid to prototype programs, I think it might shift.

18:51
the skill set kind of distribution of programmers in the long run, a little bit away from the mechanics of writing code and more towards being able to read code. So you can imagine that in five years, even with the current capabilities or slightly improved capabilities of ChatGPT, once it becomes integrated into general programmer workflows, the work

19:20
people like you and I might be doing day to day, could be something like iterating a specification to chat GPT to get it to spit out some plausible code for something. And then the ability to quickly read that code and comprehend it and see whether it's not merely plausible but actually likely to work, is gonna become a much more important skill. And to the extent that the design philosophy of Rust

19:49
revolves heavily around making the important things explicit and making code as readable as possible. That might end up synergizing with it quite well in that workflow of getting chat GPT to spit stuff out and then verifying it by eye, then seeing whether it actually runs and fixing the odd thing. So to the extent Rust is significantly more readable than a lot of other languages, that might work quite well.

20:19
Hmm. Yeah, I think I'm still erring more on the side of... because the GPT stuff generally gets you a little bit of a leg up and then doesn't work, and then you're like, well, whatever level of you-know-what-you're-doing or you-don't-know-what-you're-doing, you're a bit stuck with that, and because you... It's still too biased towards toy problems right now, isn't it?

20:46
It works well for toy problems, but as soon as you get into real world complexity, it starts falling down. That's the impression I get. Yeah, like it gives you something that looks like it might work. And then if it doesn't compile, like, well, you kind of need to know what you're doing again. So I don't think it massively pushes people down in terms of the level of skills they need to produce anything worth producing. Right. I mean, maybe I'm just not seeing the future coming.

21:16
That's kind of what I'm getting at. It's not that it does much on that front. It's just it might change the balance of work more towards verification by eye, program verification by eye and away from the mechanics of program creation maybe. Yeah, yeah, it'd be interesting to see how its capabilities grow over time. Yeah, and I was a bit worried about, you know, people getting priced out in the market of like,

21:46
if they're capturing some of the value of being a programmer, how much of that are they going to capture? But it seems at the moment, at least, they are basically selling access to their API pretty cheaply. They've already got a model, they're already charging, and it's very broadly applicable, so hopefully they'll stick with quite accessible prices for the likes of us.

22:15
which I guess means that if you're creating a product which uses their API, where you get a very small amount of value per usage and then you are charged by OpenAI some also-fractional fee, then you can still make profitable things off the back of it. But for us, where a programmer is charging significant contract rates, it's kind of an irrelevant amount of money. So that was kind of positive.

22:44
Cool. All right, what else is new on my side? Kind of gives you a picture of where I'm at and where I'm going. I think I'm gonna carry on working on the dependency injection thing. The other thing that's kind of been interesting to me on the Rust side, now that Gitopolis is kind of done, is like, what does the update workflow look like? And if you were to look at the history of Gitopolis now, I've got Dependabot running on it. The tests have been worth their weight in gold. Test coverage.

23:14
I've been able to see that Dependabot has pushed a minor version bump to some library or other. It creates branches, which is really cool. So it creates a branch and a pull request. And if you don't even look at GitHub's web interface, if you're a bit of a command-line warrior like me, you do a git fetch. And what you see is a bunch of branches appear in your repo that are like dependabot/library-name, which is very cool. And then you can basically ignore those.

23:44
And there's a couple of tools that Cargo gives you. One is built in, which is cargo update, which will do minor version updates, which are supposed to continue to work. So cargo update, cargo test, and that updates all the minor versions and reruns all my tests. And then I can be pretty confident that it all still works, and then git commit, git push. And at that point I don't bother releasing a new version because there's no...

24:11
No benefit to the end users at that point, apart from potentially security updates. Let's worry about that at the time. I was going to say, but security. Security, yes, I have to make sure that all the previous versions of Gitopolis now break through some horrible web thing. Oh God, apps. So that's quite a nice workflow. And then Dependabot closes all its PRs because it notices it's done, which is amazing. And then if you want the major versions, there isn't anything in...

24:41
available by default, which is interesting. That kind of surprised me. But there is like an, I don't know what they call it, like an add-on to Cargo. It's a crate in itself called, I don't know what it's called. I'll have to go look it up. But it gives you the Cargo Upgrade command. So not update, upgrade, subtly different. So once you've installed, I think it might be called Cargo Upgrade. I'll have to go and check. I'll put it in the show notes if I remember.

25:10
Once you've got that, you can type cargo upgrade, and then it will do the major version updates and you can rerun your tests. And I've used that. And so you'll see both of those in my history. And that's been working really well. I had one upgrade that broke my tests, which was really interesting. So there was a... What, a minor version upgrade? Oh, I don't know what it was. No, I think, well, it was a...

25:40
If it's major-minor-point, it'd be a minor version, but it was 0.something in the TOML library, and that's allowed in semver, because if you're pre-version 1, you're basically allowed to break anything at any point, and none of the rules apply. It's only when you ship version 1. I noticed that the other day, actually, by accident, that rule of semver. I thought that was very nice.

26:03
Yeah, it's good. And it's worth being aware of if you're shipping crates or any other kind of library and you're doing, you know, you want to be Semver compliant, then when you're pre, when you're version zero dot something, you can iterate away on your API and you're not, no one can shout at you for breaking Semver because you can just point them at the docs and say, well, I'm not version one yet. By shipping version one, now it has a real meaning of like, I am declaring this API stable and I'm actually going to attempt to not break it, which is cool. So they had a,

26:33
fairly significant rewrite, I think, of the toml crate that I was using, that I use for state storage in this TOML format, and it introduced a fairly subtle and not that important change to the actual on-disk TOML format that Gitopolis produced, and because my end-to-end tests are actually comparing the full TOML text, both in terms of what it

27:03
finds on disk after some operations have happened, the test failed, saying, oh, it used to look like this, and now it looks like this. And that was brilliant because it gave me the opportunity to go, am I happy with that? Do I want to do some work to make it stay the same? And in this case, it was like, nope, that's fine. It's still compatible. It can still read and write before and after. Because both... So was it just some extra items that got introduced or something? It was something to do with the way it wrote out the nested tables of stuff.

27:33
I think, off the top of my head. But, I mean, the TOML's not as tidy as I would like, but it's good enough and it works. And I was able to look at that and go, yep, that's fine, I'll accept the new format. And I updated my test to match. And now we've got a git diff that says update the TOML library. And in that diff, you can see not only the version bump in the...

28:00
I don't know if it did a lock file on the other one, I can't remember. But you can also see the diff in the test that shows you how the TOML on disk has changed without even having to run the thing, which is really cool. So yeah, upgrades, lots of fun. You mean in the reference text for the test? That's where you could see how it had changed? Yeah, right alongside the upgrade, which is really nice. Definitely an advantage of going full end-to-end.

28:30
at some point in your test coverage. This is something I picked up kind of from the Ruby crowd, the Ruby on Rails crowd that I worked with for quite some time, which is that one of the advantages of having good test coverage is that you don't actually want all of your tests to mock out all of your dependencies and libraries, because you actually want to show that your real system, when it runs with all of its dependencies, still behaves; and then you can pull in an upgrade, particularly the big ones like a new Rails version.

29:00
you want to be able to say, you know, I've done the upgrades and I can be confident all my stuff that I care about still works. That's a huge advantage of having tests that do exercise all the external libraries. Yeah, very cool. I've become a big fan of externalized, what do you call it, externalized end-to-end testing as a result of the RotKraken project, even in Perl.

29:29
Which I can talk more about. Yeah, dive into that. You were telling me the other day about what you did with that. Yeah. So I used a... I forget the exact name of the test module that I used in the Perl version, but I basically wrote a Perl test program that invokes RotKraken with various arguments and does...

29:58
and removes items on the file system in order to do the testing. And it not only parses RotKraken's output where appropriate and expects certain output, but it also inspects the extended attributes directly on the file system. And this kind of exhaustive but externalized testing that not only, uh, checks what the product itself is doing,

30:27
but checks the side effects is an approach I've used before professionally when I was working on a web service a few years ago. And there again, I was looking at what the web service was returning when things were done to it. But I was also actually doing direct database inspection to see what it was doing to the database. And it kind of seems redundant or like a mixing of concerns,

30:57
but it's a damned fine way of making absolutely sure that the thing is doing what you expect it to do. And with RotKraken, the benefit of that approach, the end-to-end testing, kind of speaks for itself. You're testing the real product in a real situation, verifying everything that it's supposed to be doing. It's the least brittle form of testing, and it's the form of testing least prone to missing problems.
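To make the externalised approach concrete, here's a rough editor's sketch (not RotKraken's actual test harness) of driving a binary from the outside and checking both its exit status and the xattr side effects. The binary name, flag and xattr key are hypothetical, and it assumes the `xattr` crate as a dev-dependency.

```rust
// Editor's sketch of an externalised end-to-end test; the binary name, the
// --index flag and the xattr key are all made up, not RotKraken's real interface.
use std::{fs, process::Command};

#[test]
fn indexing_writes_a_hash_xattr() {
    let dir = std::env::temp_dir().join("e2e-demo");
    fs::create_dir_all(&dir).unwrap();
    let file = dir.join("sample.txt");
    fs::write(&file, b"hello").unwrap();

    // Run the real binary exactly as a user would.
    let output = Command::new(env!("CARGO_BIN_EXE_mytool")) // hypothetical binary name
        .arg("--index")
        .arg(&dir)
        .output()
        .expect("binary should run");
    assert!(output.status.success());

    // Check the side effect on the file system, not just the stdout.
    let stored = xattr::get(&file, "user.demo.md5").unwrap();
    assert!(stored.is_some(), "expected a hash xattr to be written");
}
```

Because the test only talks to the executable and the file system, it keeps working even if the internals are rewritten in another language, which is exactly the pay-off described next.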

31:27
But as it turns out, when it came to, for example, translating RotKraken into Rust, I was armed with the massive advantage that once I decided that I was going to finish translating it in full, I had a Perl test script that worked, because it was externalised testing. I literally pointed it at a different executable and I could see what wasn't working yet. Mm-hmm. No work. There's a lot to be said for that,

31:55
not having to put in work for testing.

32:02
Thank you, thank you. Well deserved applause for the test that continued to run even though you'd rewritten the internals. Yeah, I had to iron out two small issues, but it pointed me right at those issues. I got my head wrong about the underlying cause of one of those issues for a while, but eventually realized what was going on, which was an unimplemented file name escaping feature from the original version.

32:31
Once I got that working, suddenly the tests weren't all failing. Um, and the other issue was, um, around exit codes for the product, um, which needed a bit more thought and refinement and revision anyway, because it was, it was a bit of an unhappy 90% kind of Pareto thing, the way I'd been doing exit codes and it forced me to think about it more and I've now come up with a good strategy after a couple of iterations.
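For illustration, one way to pin down an exit-code policy in Rust is to enumerate the outcomes and map them onto `std::process::ExitCode`. This is an editor's sketch of the general technique only; the categories and numbers are made up and are not the scheme RotKraken actually settled on.

```rust
// Editor's sketch: an explicit exit-code policy. The categories and numbers
// here are hypothetical, not RotKraken's actual scheme.
use std::process::ExitCode;

enum Outcome {
    Clean,      // everything verified
    Mismatches, // some files failed verification
    UsageError, // bad arguments
}

impl Outcome {
    fn exit_code(&self) -> ExitCode {
        match self {
            Outcome::Clean => ExitCode::SUCCESS,
            Outcome::Mismatches => ExitCode::from(1),
            Outcome::UsageError => ExitCode::from(2),
        }
    }
}

fn run() -> Outcome {
    // ... real work would go here ...
    Outcome::Clean
}

fn main() -> ExitCode {
    run().exit_code()
}
```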

33:01
So it was all to the good just getting free tests for the new product with zero effort, because I must admit, as a programmer, I'm still one of these old school code warriors who hates the idea of spending hours writing tests before you can get to the meat of the actual thing. It's a big motivational factor. So yeah. So right now the Rust product is tested by the Perl script, and that's fine for the time being. It'd be nice to

33:29
translate that to Rust eventually, but it's not an urgent need. No, no; when you'll probably want that is when you fire up your GitHub Actions, because this is one of the things I wanted in Gitopolis: if somebody opens a pull request, I want to be able to see their diff of the real thing, their diff in the tests, to show me how they think they've changed the behaviour, and then prove that it's changed in the way they say it has, and then be able to basically go, yeah, I'm cool with that

33:59
without having to like fire it up and run it locally. So yeah. Yeah, fair point. The CI automation is where that will shine. Cool. Yeah, to carry on with that end-to-end testing, external testing thing. Okay. Talk about what we took... well, let me dive in with the component testing thing that I came across in the last contract. Or do you want to jump in first with something? No, you go for it. I'm just wondering whether I should

34:26
just give the shortest, if I am going to talk more about that, whether I should give the shortest possible background on what RotKraken is and where it came from. Um, but you kick off, by all means. All right. So I was, I, I was interested in... I basically realized that ultimately all bets are off with all forms of storage with regard to data integrity. This is some years ago, probably about five years ago.

34:54
But the whole idea that... just ask the person whose Windows machine had crashed twice and I've rescued their, like, university dissertation twice. Yeah, I had a git repository corrupted through a power failure in the last couple of years. Yeah, I've had that. Yeah. So yeah, ultimately, all bets are off. And much like the internet being designed around the concept of unreliable transport of data,

35:22
there's no amount of checksumming inside bus protocols on motherboards, checksumming on disk access protocols like SATA or whatever, checksumming via network connections and all the rest of it that can plug all the holes that can lead to data corruption. It's just a non-starter. So I came to the conclusion, some years ago, that regular verification is the only way to keep data known-good. And

35:52
So I got into a workflow of using md5deep, keeping log files of that lying around on disk, and indeed committing them to tape when I wrote large data sets to tape and using that to verify what came off tape. People who know tape will probably think that's ridiculously paranoid, but if nothing else can be trusted, why should I trust tape, even though it's designed for data integrity? So.

36:19
I was using this workflow, and the big problem I spotted, as you just mentioned a little while ago, was that the problem with md5deep log files is they get out of date very quickly. Just the slightest bit of reorganization of the structure of your data on disk introduces problems with path changes that mean that the logs rapidly become useless to you.

36:47
So for a couple of years, I had this idea kicking around: oh, I wish I could just associate the hashes with the file system items so that they can be carried around with them, and that problem is solved. And this idea kicked around for about a year. And then one day, while I was researching a totally unrelated problem to do with the extent to which Unix file systems

37:13
can emulate certain features of DOS file systems, like the archive attribute. Spoiler alert, it can't. I tripped over the phrase extended attributes and I looked into this and I was like, holy moly, this is what I've been looking for, for this idea. Which is that extended attributes are basically a way to associate arbitrary binary data with string keys.

37:41
and you can have an arbitrary key value store like that stored against file system objects. And certainly on Linux file systems that applies to regular files and directories. I think on macOS you can't associate with directories, only regular files. Anyway, I thought this is a solution and I very quickly started prototyping the first version of RotKraken which uses four keys to associate

38:10
auditing data with regular files on disk. And I very quickly came up with the first version of that as I'd been becoming increasingly proficient in Perl over the preceding couple of years. And there's a lot to be said for it as a way to, as a general prototyping tool for large classes of problems and as a programming language. I think there are still things that Rust could learn from Perl, frankly.

38:38
and most other programming languages. Some things like the suffix conditionals in Perl are an absolute godsend and I'm mystified that Rust hasn't cottoned onto that and found a way to work it into the language yet. What is a suffix conditional? What it means is that instead of doing if condition, open brace block, do things inside brace block, you just do things if condition in one line. Ah, Ruby has that.

39:08
My word does that save a lot of pointless lines of code, which is great because screen space is always limited. Yeah, you quite often see in Ruby like blow up unless foo. That's the other thing, having unless as an alternative to if in all the places where you can use if, that's another brilliant Perl feature that not enough languages have imitated, including Rust. So those two alone, I think the Rust people should be paying more attention to.
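For anyone who hasn't met them, this is roughly what the Perl forms being praised look like next to today's Rust equivalent (an editor's illustration of the comparison, not code from either project):

```rust
// Perl's statement modifiers (shown here as comments for comparison):
//
//   return if $done;              # suffix conditional
//   warn "missing" unless $ok;    # `unless` as a negated `if`
//
// The closest idiomatic Rust today still needs the block form:
fn demo(done: bool, ok: bool) {
    if done {
        return;
    }
    if !ok {
        eprintln!("missing");
    }
}

fn main() {
    demo(false, true);
}
```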

39:37
Enough about singing the praises of Perl on a Rust podcast. And the name RotKraken, I assume, is a play on words for bit rot and the Kraken. Yeah, the Kraken, the thing that gets everywhere because that's what it does. And it's, it was always written from a point of view of very, very explicitly not being allowed to ever touch the contents of files, not being allowed to write them and never giving the impression that it ever does that.

40:07
all it does is read file contents and faff with extended attributes essentially. So, yeah, so I came up with the first version of that pretty quickly. And over the last couple of years, that's got iterated with gradual addition of more and more useful features based on the hard experience of integrating it into my data integrity workflow and being annoyed by things, which is classic evolutionary development and a great way to do anything.

40:37
So the ultimate dogfooding, I wrote it for myself. Yeah, I wrote it for myself and every feature that's in there is a feature that's useful to me. And I paid the cost of developing it because I had a need for it. It's great. Yeah, this is one of the privileges of being a programmer. It's like, I want this tool to exist. Okay, I'll make it. Yeah, yeah. So it very quickly became integrated into my whole data integrity strategy, along with BTRFS,

41:04
btrbk, the snapshot management and propagation tool, and one other thing that I also wrote quite recently and very fast in Perl, which is to do with replicated data on multiple servers and keeping that in sync via snapshots, I've got an overall data integrity approach that is extremely powerful and eats maybe a couple of percent

41:34
of the time, if that, that I used to spend worrying about data integrity. For the most part, data integrity is something I just don't care about anymore simply because I've automated it away, which is a great place to be. I'm so happy and so relaxed these days because I'm not burning weeks of my life worrying about the same old nonsense again and again. So, so that's great. And, um, so RotKraken's evolved over the years into a powerful and essential tool for me. And, um,

42:04
As we mentioned, I recently rewrote it in Rust, even though there was no real need to do it. I mean, even for a two times speed up, there's an argument that, for example, because of my servers, I have them automated to do nightly rescans of the file system and to initialize new files and to update files that have changed by looking at the modification timestamp.

42:35
Um, that's a process that maybe assuming no major data has appeared on the system takes four minutes and rewriting it in Rust has reduced it to more like a minute and a half. Big whoop. It's an automated process. Who cares how long it takes within reason. However, it's been quite the learning experience and I'm still glad I did it. And I'm still feeling good about the possibility of moving RotKraken forward as a product.

43:05
based on being fundamentally faster. And based on the degree to which Rust prevents you from not thinking about things you don't wanna think about. It makes you think about where errors can occur and make decisions, conscious decisions on that basis that simply didn't happen in Perl. So it's a better product than the Perl version, even with the same feature set for that reason, because it's forced me to think things like

43:35
IO errors, permissions errors through properly, and develop philosophies about them that I can bake into the product. Yeah, this is the Result type, isn't it, in Rust? Mm-hmm. Yeah, I found exactly the same thing in Gitopolis. I'd have something that could blow up and something else that could blow up and then something that called both of them, and I was like, oh, now what? And ended up having to create a compound error type,

44:02
which could be one of these, and then I could use the question mark to propagate up, or a mapping, or whatever. I can talk about that for a moment. The major part of the work of translating RotKraken to Rust was in fact in replicating its file system traversal.
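A quick editor's sketch of the compound error type idea being described (not the actual Gitopolis type, and the variant names are made up): wrap each underlying error in one enum, give it `From` impls, and `?` does the conversion for you.

```rust
// Editor's sketch of a compound error type; names are hypothetical.
use std::{fs, io, num::ParseIntError};

#[derive(Debug)]
enum AppError {
    Io(io::Error),
    Parse(ParseIntError),
}

// The `From` impls are what let `?` convert each underlying error automatically.
impl From<io::Error> for AppError {
    fn from(e: io::Error) -> Self {
        AppError::Io(e)
    }
}
impl From<ParseIntError> for AppError {
    fn from(e: ParseIntError) -> Self {
        AppError::Parse(e)
    }
}

fn read_count(path: &str) -> Result<u32, AppError> {
    let text = fs::read_to_string(path)?; // io::Error -> AppError via From
    let count = text.trim().parse()?;     // ParseIntError -> AppError via From
    Ok(count)
}

fn main() {
    match read_count("count.txt") {
        Ok(n) => println!("count = {n}"),
        Err(e) => eprintln!("failed: {e:?}"),
    }
}
```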

44:28
The problem with RotKraken is that it relies on the existence of certain, what I call, metafiles, which are .rk.quiet and .rk.skip. So skip is a way to just stop it traversing parts of the file system. That's essential on BTRFS because BTRFS's snapshots are all live on the file system. All the contents are there. If you don't skip your snapshots, you're going to amplify the amount of time it takes to traverse the file system massively, to no good effect,

44:56
because they're all read-only anyway, and they're all outdated snapshots compared to the live data. And then there's .rk.quiet, which is a way of telling it to do its usual work, but not to report explicitly on those items. So that's a useful feature for sensitive data or other things that would just generate too much noise in the logs. Things that change regularly, for example, like /etc.

45:26
So there are those metafiles, and they live in specific places in the file system, and their contents apply to where they live in the file system. So the major problem of rewriting RK was that I needed to replicate the Perl version's traversal pattern, and that traversal pattern was dependent on those metafiles. So to cut a long story short, I started with walkdir, which is a crate for Rust

45:54
that actually I think was originally developed in Python, a generic file system tree traversal crate that exposes it via what you might call external iteration, in the sense that it lays the whole thing flat and you just get notifications of a series of files and you can see what paths they're at. Do you get, like... do you call iter on it or into_iter or something? You can if you want, but it is

46:24
actually an implementation of the standard Rust iterator patterns, to the extent that you can use things like for x in y, and you use them implicitly without having to call iter or into_iter. Yeah. So the problem with laying a tree structure flat via that kind of external iteration is that it's hard to know when you're descending and ascending in levels within the tree structure.

46:54
And that turned out to be a critical problem for RotKraken, because spotting these metafiles and interpreting them correctly relied on knowing that accurately. So the kind of approach of linear iterators just didn't work well with that. And I spent a week or two trying to make it work. I did a lot of work on analyzing whether the new item is a file or a directory

47:23
and looking at its depth indicator, the integer saying how many levels deep it is in the hierarchy, trying to work out an algorithm for deducing descent and ascent based on looking at the new item and comparing it to the depth and the nature of the previous item that was seen. And heaven knows I tried to make that work, but in the end I thought this is eating an inordinate amount of time. Multiple times I thought I had it nailed. It's just not working.
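For reference, this is roughly what the flat, external-iteration view looks like with the real walkdir crate (editor's sketch): each entry carries a `depth()`, but there is no callback for entering or leaving a directory, which is exactly the gap being described.

```rust
// A small walkdir example (editor's sketch): the iterator hands back a flat
// stream of entries with a depth number, but no explicit descend/ascend events.
use walkdir::WalkDir;

fn main() {
    for entry in WalkDir::new(".").into_iter().filter_map(Result::ok) {
        // You can see how deep each item is, and whether it's a directory,
        // but inferring "we just left directory X" from consecutive depths
        // is the awkward bookkeeping described above.
        println!(
            "{:indent$}{} (depth {})",
            "",
            entry.path().display(),
            entry.depth(),
            indent = entry.depth() * 2
        );
    }
}
```

The same `depth()` value is also what makes library-level `min_depth`/`max_depth` options arguably redundant, since the consumer can do that filtering itself, a point that comes up again below.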

47:53
So I effectively forked the walkdir crate, and I started a process of continuously morphing it into internal iteration in such a way that I get explicit notifications for descent and ascent so that I could solve the problem. And that's the approach that I pretty quickly made work. But the point I'm getting at here is that I...

48:19
I had kind of deliberately avoided knowing anything about Result up to this point in Rust. It just seemed like something that I really didn't want to have to care about when I was learning the language. I just thought, yeah, I'll stick unwraps everywhere. If it blows up, it blows up. At least I'm learning and getting something that works. That process. Right. Right. That process of iterating that library while having it work at every stage

48:47
between two radically different topologies.

48:52
didn't allow me to continue that ignorance. And the process of iterating, iteratively refactoring it between those two topologies, forced me to become very conversant with Result and with errors and with how they work in Rust, just because I had to keep dealing with the same things again and again. And there was an interesting side note to that, which is that I started developing patterns

49:22
in a way that I remember doing in C++. I mean, it's been a while in C++ since I've developed a new pattern for dealing with a certain class of problem because I've been doing it so long. But I remember doing it 20 years ago. I went through a period of learning where I was devising these patterns and developing a programming style in C++ based on those patterns. And I could see myself doing it as I was working on this Rust code, morphing this library.

49:51
I could see that having to deal with results and errors time and time again was making me do some clever little things like say you're doing a bunch of things that can all error inside a function and the function, you don't want the function itself to return a result because you've decided that the errors should stay contained. You should deal with them in the function and they shouldn't leak out. But at the same time, the errors that can be generated by the things you're calling

50:20
might differ in type, they'd be different types of errors or they might even all be the same type. But if your function is returning a result, it's not the error type you want in that result. So all those kind of incompatibilities and containment issues led to me doing a few times a neat pattern where I would stick the meat of what the function was doing in a closure.

50:47
inside which I could use the question mark operator, because all the errors in the things that were being called were of the same type. And so I could use the question mark to make that code much briefer. And then I could add some explicit error handling at the point where I would actually call the closure, after defining it, to deal with all of those things in one go and contain the errors.
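Here is a condensed editor's sketch of that pattern, under the same assumptions: the fallible steps live in a closure so `?` keeps them brief, and the error is handled once at the call site, without the enclosing function having to return a `Result`.

```rust
// Editor's sketch of the "closure plus ?" containment pattern described above.
use std::{fs, io};

fn report_sizes(paths: &[&str]) {
    for path in paths {
        // All the fallible steps share the same error type (io::Error), so `?`
        // works inside the closure and keeps the happy path brief...
        let attempt = || -> io::Result<u64> {
            let meta = fs::metadata(path)?;
            let contents = fs::read(path)?;
            Ok(meta.len().max(contents.len() as u64))
        };
        // ...and the error is handled here, in one place, so it never leaks out
        // of report_sizes, which deliberately doesn't return a Result.
        match attempt() {
            Ok(size) => println!("{path}: {size} bytes"),
            Err(e) => eprintln!("{path}: skipped ({e})"),
        }
    }
}

fn main() {
    report_sizes(&["Cargo.toml", "does-not-exist.txt"]);
}
```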

51:15
So little patterns like that started emerging. Sorry? I look forward to seeing that. Yeah, yeah, that's interesting. Again, just totally non-ideological, practical way of dealing with things. And developing patterns that work and patterns that at least to me make the code maximally readable. So that's been a heck of a learning experience just modifying that library.

51:44
and deciding what features to strip out; there were two categories of that. The one category was that walkdir had some features that struck me as obviously silly, just unnecessary complexity. Things that can be done by the consumer with no more effort than having it in the library, that just complicate the library unnecessarily. Things like min_depth and max_depth, which we might be familiar with

52:12
from using a tool like find. Obviously the author of walkdir was trying to emulate the basic capabilities of find. But given that the nodes that you're handed when you're iterating contain a depth field, why precisely would you bother having min_depth and max_depth in the library? You can just do those checks in the consumer. So I got rid of it. Cool. That's an example. And then there was other stuff that I just didn't need for RotKraken, like the ability to follow symbolic links.

52:42
which really should go back in at some point when I publish it as a general purpose crate. But symbolic links create the implied problem of potential loops if you're following them in the file system. And that was the complexity I just really didn't want to have to deal with with this particular thing. So I reluctantly stripped that out thinking I'll put it back later once the code is in a more final form and it's easier to add it back.

53:12
It's one of those things where the time trade-off of continuously deforming some code you don't fully understand into a radically new topology, versus just putting it back when you've done the transformation... it's not always obvious which way to go on that one. And I always err towards not stripping stuff out, but sometimes it's the right way to go. So I've got a couple of questions off the back of all that. One is about licenses and the other is about what you've learned. So let's start with licenses.

53:42
So what are you releasing RotKraken under, and what does that mean for the things that you're using and modifying? Okay, so I am... There is a question here about the licensing of its dependencies. I've not evaluated that yet, but the approach I want to take if I... Yeah, yeah, the approach I want to take if I can is...

54:11
to release the Rust version of RotKraken under the GPL 3 or later license as I did with the Perl version. I believe it's a product of sufficient general usefulness for data integrity enthusiasts that, I mean, one could go the MIT route and just encourage adoption at all costs, but I thought actually,

54:37
it would be nice, for something this useful, for somebody else to occasionally do a bit of work if there's an improvement that's worth doing, and for the main product to get the advantage of that. So that was the motivation for GPL 3 or later. I considered the whole GPL 2 versus 3 thing, but I decided that allowing people the loophole of using it in web contexts and just bypassing

55:06
was not something I was interested in allowing for. I can't think of... You're talking about the Affero GPL, the AGPL.

55:14
Possibly. You might be... your mind might be a bit sharper on these distinctions than mine right now. But I believe one of the problems with even GPL 2 was that, just because it pertains mainly to compiled code, using it in an interpreted context gets muddy, and using it in a web context where the software is not actually getting distributed leaves loopholes open. Yeah, that's still the case in GPL 3:

55:43
If you write some tool and that becomes part of a web service where the tool is run based on user request, GPL 3 won't prevent that being extended and not shared. Whereas the AGPL is the one that says, if this is provided as a service, then you also have to share any alterations. So that's why I went for that on Gitopolis, because that's the mightiest protection of, you know, oi Google and friends,

56:12
don't make millions of dollars without sharing what you've done. Okay. I'm making a note of that because you've just educated me in a useful way. Um, but my thought is less people like Google and more people like, um, Synology, for instance, who now routinely release products that use BTRFS and who might conceivably want to integrate something like RotKraken into their web interface at some point. So

56:41
Yeah, that might be a thing to think about. As for the file system traversal library, I'm thinking more MIT for that, by the way.

56:51
That's a simple thing. I guess that depends what the original was in as well. I believe the original was MIT and I have checked that. So the fact that I've utterly, utterly bastardized it into a new form is not a big problem in that context. Yeah, and then you can include that in an AGPL thing and you're all good, I believe. Right. I am not a lawyer and all that.

57:20
No, but you're the closest thing I have when it comes to software licensing, so I might be coming back to you on that one. Yeah, yeah, that's interesting stuff. Um, was there a second question? Yes. So I think all of that background and detail on RotKraken, and kind of where you ended up and why, leads nicely into what you've learned about Rust, what you think is interesting, good, bad, particularly your comparisons with C++, since we last kind of

57:48
covered our enthusiasm for the new thing, because you sort of hinted at some things I didn't necessarily follow. Right. So yeah, let's have a bit of a freeform dig into like, what did you learn? What's your current thinking about Rust? And would you use Rust in future and what for? Okay, okay. I, tell me if I'm getting too into the weeds, but I'll try and start general. No, I think weeds is good at this point. I'll try and start general and.

58:16
gradually explore the roots, but stop me if I'm getting down a rabbit hole. So my general impression is I'm now in a position with Rust that I probably was a few years into C++, which is something in itself about the speed with which I acquired competence in Rust. And that point is... Yeah, to be clear, that's like in a few months. Yeah.

58:45
couple of months. Yeah. And it's mainly the 20 years of C++ experience that has force-multiplied my time to that degree, we have to be clear. And yet it says something about Rust itself that a mind like mine can become comfortable with it so quickly. So my, I'm at the point now where I'm not a newbie anymore. And I've used it enough to be annoyed by it in a few minor ways.

59:17
and I can articulate what those are, but my general comment is that despite those annoyances, I still think it's the future, and I still think that it is going to start a revolution in software, in the sense that it's moved the needle so far in one generational move between languages from C++ to Rust, because I consider Rust

59:41
the only serious contender I've ever seen as a successor to C++, which I think I mentioned on the last episode. And that is a very load-bearing statement, and there's a lot to be said about that that I could say, but I think we went into that at the time. It's about the design philosophy, the largely non-ideological design philosophy, but still highly principled in important and pragmatic ways. And so in that context,

01:00:11
There are a few things about Rust that have started to annoy me now that I've started to use it, but I think it is going to move the needle in a radical way on the trade-off between software complexity and reliability, not to mention performance. The ability to get reliability and performance by writing very complex things.

01:00:39
or how can I put this, the ability to tackle very complex things in Rust without incurring such a massive reliability penalty that it becomes impractical, I think is a huge improvement on C++ in general. And the fact that it also achieves most of the, easily achieves most of the performance of C++ is kind of a side benefit in that context.

01:01:07
So that's my general thought. It's still the future. And I think over the next 10 years, the push to do more and more stuff in Rust by all kinds of organizations and individuals is going to create a rolling revolution that makes software of the same complexity noticeably more reliable and performant, but also allows more complex software to be built that just wasn't practical before.

01:01:36
Let me drill in a bit on that. So when you say performance, I can hear, in the little voice in the back of my head, some of the C# programmers I know kind of screaming, but C# is ridiculously fast, because I know they've optimized the hell out of their web servers and what have you to compete on benchmarks and what have you. So when you say fast, like...

01:01:59
What kind do you mean by that? Because there are sort of two views of that, aren't there? There's the C++ programmer view of fast, and then there's the JavaScript/C# programmer view of fast. So yeah, put a bit more colour on that. OK, so.

01:02:16
There's a question of how fast in absolute terms it's possible to get an algorithm to run. But I'm coming to the view that that's a relatively uninteresting way of looking at it. If you're writing in Java, say, there are real limits to just how fast you can make something. Even in Perl, RotKraken takes twice as long as the Rust version does to do a whole file system traversal on my server. And...

01:02:43
So there's the absolute view of what the apex of achievable performance is. But as an old-hand programmer who's been around for long enough, it's really about trade-offs for me now more than anything. And it goes back to what I said about how complex it's possible to make software without suffering a catastrophic reliability or performance penalty. So

01:03:11
I can say something about C sharp for instance, which is that I agree. Aside from C plus plus, I think C sharp is one of my favorite languages. I haven't done a lot of it, but I've done enough to respect the design work that's gone into it, respect the optimization. And I've proven firsthand myself in a professional context that if you know what you're doing,

01:03:39
C sharp can be made pretty much as efficient as C plus plus and as fast. I've done it. I've watched, I've watched the face of a developer on my team as I waded in to a piece of C sharp code that was dragging in a web service in a professional context and taking 10 or 20 seconds to do something relatively simple, um, that involves some statistical analysis of large data sets.

01:04:09
And I used the very small number of principles that I've established in my own mind through my 20 years of C++ on what makes fast code to restructure that code quite effortlessly and literally get a hundred times speed up in performance. I mean, his jaw dropped when he saw it. And the only wrinkle in that process was that there was a moment when what I'd done had failed to take account.

01:04:39
of the garbage collector. Once I became aware of that, I was able to revise the approach slightly because that's the major difference between C++ and C-sharp code written by people who know what they're doing on the performance front. The garbage collector fundamentally changes small parts of that calculus and you have to take that into account. It's harder to do certain types of design optimization that you would do in C++ in C-sharp.

01:05:08
because you don't get deterministic destructors, and a couple of other reasons to do with object layout. So with that borne in mind, I've proven firsthand that there's nothing wrong with C#'s performance when it's done right, according to the right principles. And so unless you've got anything to ask on any of that, I can go on to the general point that I've been thinking about on and off recently, which is

01:05:37
What is the essential difference between C++ and Rust from a performance perspective? I have one thought, which is kind of what I was gonna ask earlier, which is with that sort of in mind, the C sharp, C++, Rust thing, in terms of use cases, I've started to think of languages a bit more in terms of use cases. So, you know, Golang, great for 10,000 engineers working on a shared code base,

01:06:07
because it's very consistent, anyone can drop in, but it's not that much fun, and it's relatively limited in its language features and optimizations. C++ historically has been like, we need every last drop of speed. C-sharp, great for a team of a hundred mediocre engineers. Less so these days with more complexity in there, but generally they could get something working and it wouldn't be horrendous. It could be refactored where it mattered. Like I've sort of thought, well, I don't...

01:06:36
I don't really see the C Sharp Windows houses making the jump to Rust particularly, unless they're one of these places that's led by whatever the programmers find interesting and will let them do it, which is a different game. But for more of the pragmatic, we need a big bench of people, we need them to be good enough, C Sharp's really got a forte there. Build it in C Sharp, ship it to Azure. All right. So I'm kind of interested in...

01:07:05
But it sounds like you're talking specifically about web stuff when you ask the question. Well, yeah, I mean, that's where I kind of live. But what I'm getting at is, what would push, in your view, an organization or a group of developers one way or the other? Because, you know, performance, you're generally not going to win that one with the C sharp programmer camp.

01:07:35
Not throwing exceptions and not having null problems, there's definitely an advantage there, like that is something that infects the C sharp ecosystem. It's like, oh, a null reference error has popped up 20 layers down the stack. This continues to be a problem. I mean, they've retrofitted some null checking stuff to the compiler, but frankly it's not very good and I'm not...

01:08:01
very impressed with what I've seen. You can annotate stuff saying this is never null and then you can still get a null ref exception on it. So that kind of tells you what you need to know about that. I remember when you discovered that. So I think we were both deeply unimpressed by it. Yeah. So Rust has got a better start on that kind of stuff. And I can imagine, you know, a microservice that's handling huge amounts of traffic where you might go like, okay, that particular one needs to be in C++ or Rust or Golang or something where

01:08:31
you can get much better throughput. The GovUK router is my favourite example of that. It's the microservice that handles every piece of traffic that goes to Gov.UK, and then it decides which microservice is going to handle that subpath of your URL. So that has to be super performant. They had that in Ruby on Rails originally, they rewrote it in Golang, and immediately got like a hundred times speed boost. Rust obviously is going to be the same kind of ballpark.

01:09:01
The async stuff is interesting. C Sharp's obviously quite well trodden with async now, GoLang's got its own way of handling this stuff. Rust has async; my understanding is some of the crates are a bit immature on that front, but it's basically coming. It's not going to be that long before you're going to be able to pull down the crates and write your back-end web service and have all the async stuff, that's all good.
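For a flavour of what that looks like today, here's a minimal async sketch using the tokio crate. It's a hedged illustration, not anything from our projects: the function name and the ten-millisecond sleep standing in for a network call are made up, and it assumes `tokio = { version = "1", features = ["full"] }` is declared in Cargo.toml.

```rust
use tokio::time::{sleep, Duration};

// Stand-in for an async network call.
async fn fetch(id: u32) -> u32 {
    sleep(Duration::from_millis(10)).await;
    id * 2
}

#[tokio::main]
async fn main() {
    // Run both "requests" concurrently and wait for both results.
    let (a, b) = tokio::join!(fetch(1), fetch(2));
    println!("{a} {b}");
}
```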

01:09:30
So yeah, in terms of like, what's going to be the thing in your view that drives, you know, we've got performance, we've got like the quality of the language, the error, error proofing as it were, like the failure mode problems.

01:09:45
Yeah, talk about where you would push to Rust, where you wouldn't push to Rust, that kind of thing. Okay, I think that this question of where C++ and Rust are fundamentally different is very illustrative here, because it shows this trade-off I'm trying to talk about. The trade-off of complexity of problems you can solve versus reliability of the software you write to solve the problems,

01:10:14
versus performance of same. And my view in general, I think, is that, we've seen this, for example, with what you mentioned about not throwing exceptions in C-sharp web apps, for instance, because exceptions are known to be a performance problem. At the very least, that can enable things like DOS attacks, even though they're a very elegant way of dealing with control flow.

01:10:42
in fallible problem spaces sometimes. And so there's that performance aspect of it. And you see this also with things like Golang, where the model of developing web-based stuff in Golang is based on the recent discovery that multi-threading of web servers just doesn't cut it.

01:11:10
with the number of threads you need for a high load web server, the sheer amount of virtual address space that has to be allocated in the server process for all the stacks of all the threads, it can become a crippling problem in itself. So that whole move towards, no, we're not even going to multi-thread, we're going to rely on other mechanisms, which are all kinds of fibers or variants thereof, ways of...

01:11:38
Async, basically. But the big push to async in C sharp web development and other languages recently is, in my opinion, down to the fact that threads have come to be regarded as too heavyweight because of those stack allocations of virtual address space. And so you just need a mechanism that's more lightweight than threads for handling transactions that consist of a lot of component operations. And that's where things like async come in,

01:12:08
because it doesn't really care what thread it runs in. And the same long running transaction can get handed off between multiple threads in the pool. And nobody really cares about that because async takes care of tying all the steps together into one cogent process. So that's the kind of, my kind of secondhand understanding of the background for this whole way of thinking in the web dev world. And

01:12:38
One of the reasons that I don't particularly like things like async is that they're a purely pragmatically driven solution to this real pragmatic consideration that if you're not careful horribly overcomplicates the code you write. What you really need is a lightweight thread abstraction that solves this problem. What you don't need is to fill your code with a bunch of rubbish about async calls.

01:13:07
That just makes it much, much less readable and more complicated. So I think there are better solutions than async to that problem. But moving on from that, this general point about the trade-off between performance, reliability and problem complexity: I think people like me have been hoping for a lot of years that the dream of multi-language programming will come to pass, and it just hasn't.

01:13:34
People generally don't develop specialized libraries for hotspots, write them in C++ and then integrate them into other languages through binary linking. And there are reasons that hasn't come to pass that are too tedious and annoying to go into. But I think one of those reasons, which I'll get to, is overriding. So,

01:14:03
What I have thought for years is why don't people writing web apps just outsource the really hardcore tight loop stuff that has to be super performant to a C++ library and you just have web apps that have a C++ library that's got that core business logic that has to be super fast. You can write the rest in some nicer language that's more readable.

01:14:26
all of the user interaction and presentation stuff can still be written in something like C-sharp, Ruby on Rails, doesn't really matter. Why hasn't that happened? And I think the answer is not clear cut, but it just comes down to this complexity, reliability, performance trade-off. Yes, you can write your backend in C++ for your web app in order to solve the performance problem. In so doing,

01:14:52
you will require specialized skills, for which there's a very limited pool in the market, and that segregates your skills needs in an annoying way. The programmers are no longer fungible. You need specialized C plus plus programmers to maintain this library. But I think the overall weight of the reason why this hasn't happened more

01:15:21
is just yes, you can get performance in C++, but you can't easily get enough reliability at the same time. It's just too annoying from that perspective. And I think that might be one of the major reasons that this paradigm shift hasn't happened yet. And one of the reasons I'm excited about Rust is I think that that moves the needle on that three-way trade-off to the point where that might finally happen. It might actually be practical for developers of...

01:15:50
web apps, working in organizations that are sensitive to hosting costs, to justify introducing Rust backend components and justify the skills divorce that comes from that, because you can do it without compromising reliability significantly, in a way that works for the overall business. So I think we might start to see that happen because of the existence of Rust. So if somebody split out a backend piece into C++

01:16:20
then they'd run into these reliability problems because of how hard it is to write reliable C++. Yeah, you need to be very, very good and have a lot of experience in C++ to write remotely reliable code, at least beyond a certain level of complexity. And I'm proud to say that I've developed a couple of products that are extremely reliable in C++ at the same time as being extremely performant, but it's the product of 20 years of experience. And even then, it's not totally reliable.

01:16:50
very occasionally the stuff still crashes and fixing that last 0.1% is non-trivial because even replicating those cases or knowing vaguely what's going on in those cases is where do you start? It's a difficult problem. Right, do we have a word from our sponsors as it were? And come back in two minutes. Okay.

01:17:19
Right, so when we left, I was asking about parameters for real usage, we got a little bit into the performance stuff, so let's carry on digging in. So I'm still trying to gun for the, like, what would push us to use Rust for real and real projects. You mentioned about the difficulties of having multiple languages in a project, and that's certainly something I've seen in discussions around clients saying, well,

01:17:47
We want to have a language that our team does. So they're super hesitant to bring in another language. The DfE that I worked with pushed everything to Ruby on Rails because it was a good fit for government, but also they had all these different things and it meant that teams couldn't move from project to project. There's a lot to be said for having one thing. You talked about C++ being difficult to make super reliable. If you pull out a piece,

01:18:16
you talked about having like wrappers. I have seen a bit of that with C programs; there's so much basic C code out there that does all sorts of things in the Unix world. It's not uncommon to have a C library that's then wrapped and shipped as a library in other languages. I think some of the Git stuff has happened in that way for a bit, but they do tend to then end up rewriting purely in their own language and shipping libraries that way.

01:18:47
There's an element of organizational evolution to this as well. I think that we might, this could be wishful thinking, but I think that we might be coming towards the end of the era when organizations can afford the luxury of having a homogenous skill set in their team. The rise of DevOps and the rise of

01:19:15
separate operations stuff in general as a dedicated function that requires funded staff positions just to keep infrastructure running, especially given the explosion of web infrastructure and massive companies doing essential functions via web infrastructure. It started to normalize the idea that you have to have some skill specialization within tech

01:19:44
So in that sense, I guess part of the reason that, say C++ hasn't caught on as a way of writing backend performant libraries, where it really counts, is part of the fact that the industry was just a bit too young up to this point. But I suspect that that change is coming over the next 10 to 15 years, where at least large organizations just,

01:20:12
start understanding that it's a cost of doing business that you have to have diverse and to some extent divorced skills in your workforce to handle operations, to handle development, and maybe given the advent of things like Rust, the maintenance and development of backend components as a separate discipline, nevertheless tightly integrated into the larger development organization.

01:20:42
I can foresee any decent size organization understanding that they need one or two Rust positions on an ongoing basis, forming a team that's responsible for just maintaining the separate hot code path stuff that lives in a library, and that the evolution of approaches to organizational structure and communication has started to provide companies

01:21:12
with the tools that they need to manage that situation without too many pathologies of the kind that you got when you separated out library teams in the past. I'm seeing some of that play out. So I'm seeing platform teams being a thing. So you've got your generic web hosting of Azure and AWS and what have you. And then the organization has particular needs like regulatory needs or auditing needs or whatever,

01:21:40
and just the need to have one way to do things. So they spin up a team who creates a platform on top of AWS or Azure or whatever. And then that's ideally the preferred way for all of the microservice teams to build and deploy. They might have some templates that they kick off projects from that are kind of the way of doing it. So then you can have definitely like the platform team could write all their stuff in Rust potentially, but then the actual...

01:22:09
Backend code could potentially be Node.js or C Sharp or whatever. That's one of the ways I'm seeing that play out. Another one is that just literally different microservices run by different teams. So you might have a finance team that runs the microservices that might have a completely different culture and a different funding structure. And then you might have an e-commerce web front-end team that has their culture and their needs and their technologies of choice. So that's another way it pans out.

01:22:39
And then the other one that I've been hearing about is like these kind of enablement teams. So you've got say, you know, five or 10 teams who are all running their own microservices. They have some shared problems, they have some, you know, shared tooling that they need and you might have another team that doesn't work directly on a microservice, but basically goes around doing whatever it's going to take to help these other teams to be excellent or to have the tools they need. So they might write.

01:23:07
you know, some shared open telemetry code or something like that, that then all the other teams can pull in, like a shared library. Maybe they'd write a crate, maybe they'd write a NuGet library that serves some shared purposes. So yeah, I'm definitely seeing some moves in that direction. I think there's a useful distinction to be made talking about reliability, by the way, because from a lot of the things I've just said, it sounds like

01:23:36
I sound like a younger version of myself who...

01:23:42
believes that reliability is fundamentally important or that performance is fundamentally important. And we see all around us these days, the evidence in phone apps, desktop software, web apps, that reliability and performance, neither of those are as important as people like us would like to believe, and that companies can get by on surprisingly crappy tech and still make large land grabs and make a lot of money.

01:24:12
and that by default they will often tend to do that. But I think there's a slight distinction to be made here. You can acknowledge that reality while at the same time understanding that the kind of reliability problems that crash your web server are just not permissible in a large organization. And certainly the kind of reliability problems that even throw unexpected exceptions often enough to cause significant unnecessary web server load

01:24:42
are not really acceptable given how much people are focusing on web server performance lately. So there's a distinction to be made there. Don't forget cost. I've seen quite a few places talking about their, you know, their Azure bill being too high and by writing in a lower level language, you're going to impact that. That's the distinction I'm trying to draw that there are some forms of unreliability that have very real and very easily measurable cost implications. And on the web server side.

01:25:13
there's no doubt that reliability and performance now are considered paramount, even if in the general software industry, there's a woeful lack of attention to those things. Yeah, and I'm seeing that monitored with these exception reporting tools and telemetry tools. So you're seeing graphs of how many of your web requests coming into your API are successful and how many of them are erroring, and what's the reasons, and seeing spikes.

01:25:42
you know, the garbage collection thing that we talked about earlier, you see a spike in, a regular spike in latency, which is like actually causing poor performance for actual users of an app or a website or whatever. Yeah. Yeah, yeah. Yeah, that's, it's interesting that those garbage collection spikes are, to some degree, related to one of the major principles of writing.

01:26:11
ridiculously fast code that became clear in my mind some time ago through working in C++, which is avoiding unnecessary memory allocations, certainly inside tight loops. Try and move all the memory allocations outside of those loops, reuse memory rather than deleting and re-allocating it. That has design implications, but if you've learned patterns for minimizing that kind of thing, like the toy sketch below.
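A hypothetical illustration of that principle, not anything from RotKraken: allocate once before the loop and keep reusing the buffer, rather than building a fresh String on every iteration.

```rust
use std::fmt::Write as _; // lets us write!() into a String

fn print_squares(n: u32) {
    // One allocation up front, reused on every iteration.
    let mut line = String::with_capacity(32);
    for i in 0..n {
        line.clear(); // keeps the capacity, drops the contents
        let _ = write!(line, "{i} squared is {}", i * i);
        println!("{line}");
    }
}

fn main() {
    print_squares(5);
}
```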

01:26:38
That just automatically makes all your code so ridiculously fast it'll blow your head off. So I think with a garbage collector you get related problems like that, where you're creating too much garbage by failing to follow those patterns, and some of the solutions are slightly different, but it's very closely related. Yeah, I mean, where I've heard the garbage collection spike thing is teams that have

01:27:05
already optimized the hell out of the language that they've got now. And like, that's the only thing left. That's like an irreducible problem of the way, the way that particular language works. Um, but yeah, just to back out about, I think what we're circling around is kind of this in terms of choices of languages, like particularly in the web services kind of arena and SaaS arena, where there's a lot of, you know, production web services that are running 24 seven that need to perform well, that need to not error very much.

01:27:35
There is a push for a single language for a team, which is a tug of war that's not resolved yet. There are advantages to being single language across, which might push you towards Node.js, because then you can have your JavaScript front and back end, but then you've got app problems. You can use other languages and then lean on the WASM, the WebAssembly, to push out to the end, potentially.

01:28:05
Rust has an advantage in that it can be compiled for all of the platforms out there: Android, iOS, Linux, Mac, Windows. So it has the potential, but you still have the Wasm overhead in the web browser. Other than that, it's a pretty good place for some core logic.

01:28:35
how your teams are structured, how your systems are structured, like just your sheer scale, like can you afford it? Do you have to have your programmers come in multi-skilled? Can you have silos of these people know this, those people know that? Does that work? OK. So, yeah, I don't think there's like a definite answer here, but I quite like the fact that we've talked around this and I think that in itself that there isn't like a clear cut answer that, you know, it depends a bit.

01:29:04
There are some trade-offs to be had, and Rust definitely brings some advantages, but we're not going to see it turn everything on its head necessarily. I think in the medium to long run, it will turn quite a few things on their heads, but as with all historical changes that we live through, it's hard to notice as it's happening. It's only now when I look back at the eighties and it looks like the sixties used to look that I see how different the world is, but I didn't really notice it while it was happening.

01:29:33
And yes, of course, there's no one answer for all organizations. I suspect that very small organizations are gonna suffer from the kind of single skill set problem, much longer than everyone else, because the simple financial incentives are lining up for big organizations to do this multi-skilling and...

01:30:02
to trade off the cost of skills diversification against the bigger cost, in a lot of cases, of not making maximal use of web server resources. So the argument's simple for those guys. They can afford the HR and other overhead of having specialized staff in different silos. It's the small organizations, you know, your 10 person company trying to do something new,

01:30:29
that's always gonna have a problem affording that kind of diversification in skills. And it might just mean that the pattern for the future becomes well, if you're doing something new and disruptive as a tiny startup, you're just gonna have to live with a degree of inefficiency on your web server that you can do away with later. Once you've made the case that it's worth doing away with by being successful enough as a company and growing. Yeah, I mean, there's maybe some extremes there like

01:30:59
Ruby on Rails is quite good for prototyping out a new SaaS idea. Ruby on Rails, stick it on Heroku, bam, see what happens. On the flip side, maybe it's gonna make sense for small organizations to write in Rust right from the get-go, because actually, they can support a ton more users. It's a perfectly good modern language. The developers love it.

01:31:23
Why not? You know, it doesn't have the pitfall of C++ of like, you know, if I picked up C++ and tried to do it, my web server would be crashing all the time. Whereas, you know, if I pick up Rust throughout my first microservice for some, you know, SaaS business idea, actually I'll probably have something super efficient on cost, super fast, super responsive, low latency. There's a real good chance that, you know, if I get my database calls and my disk stuff and the way I handle it, decent. Depends a lot on ergonomics.

01:31:51
Using Rust for the web, which is something I know very little about, and I know it's something where Golang has a bit of a head start. Yeah, yeah, I need to get more into that. That's probably my next foray; in terms of talk ideas, the dependency injection is kind of the next one. But it's an essential part of that, isn't it? It is, yeah, yeah. But having talked... oh yeah, something I haven't even mentioned is the Rust meetup I went to the other day, talking to people there.

01:32:22
One of the gaps in kind of the conference talks is, you know, how do you build backend microservices in Rust? Cause from what was mentioned, a lot of the talks go, a bit like we have been, deep into the deeper intricacies and performance and what have you, maybe off into compiler stuff, which is cool and interesting in its own right. But it's not really where the big middle of the hump is in terms of volume of usage. Like, I see this in C sharp:

01:32:51
like most of it's, you know, microservices that do APIs that then integrate with other stuff and produce business value. That's like, that's where the volume of coding is. That's where the money is for all this stuff. You know, the edge cases are perhaps more interesting, but so yeah, maybe I'll do a talk on, you know, figure out how to build these things and do a talk on kind of the state of the art of that. And that hopefully will fit really well with the Rust Workshop as a business. Cause I think there's a good chance that that's where the bulk of the business will be. Yeah. I think what you've been talking about is

01:33:21
the edge cases, some of the deeper technicalities. It's a bit like why we do pure science research because you can't, the most efficient way to invent a flying car is not necessarily to try to invent a flying car. It might be an incidental discovery and pure research in some other area that you only get because you're funding abstract pure research that makes the flying car happen quicker. And so kind of from that angle,

01:33:50
that those, that technical stuff is in the long run, deeply, deeply enabling because it changes the performance reliability and complexity trade-off of the language. Things like improving the borrow checker to be less stupid in some important way that allows certain programming patterns to be used.

01:34:15
in Rust that you couldn't otherwise and then has knock-on implications for efficiency. That helps everyone, but it happens on longer time scales and it's largely invisible to the people who are doing work-a-day web development. They just get the benefits of it down the road and they think it appeared by magic, but it didn't. Yeah, yeah, lovely. Cool, I think I'm just about covered everything I wanted to cover. There's one more thing, just like, this is totally out of flow, but I'll mention it anyway.

01:34:44
which is one thing about actual day-to-day cargo rust development that's been interesting to me, is like having a couple of laptops and hopping on and off trains and bad internet connections. That's been an interesting thing for me. I've got my laptop synced and one thing I've learned is you have to run a cargo build on the machine that you're about to take away when you've got good

01:35:14
I made the mistake of jumping in the car with a laptop. I had the machine synced before I went, but I hadn't run a cargo build. And I got there and ran a cargo build. And it's like, downloading from crates.io. Oh no, I have not got the internet for this. Damn, I can't run a build. Yeah. I think we discussed this previously, didn't we? The ridiculous data requirements of cargo. And funnily enough, about a week after we had that conversation, you sent me something

01:35:43
about how they're looking for opt-in volunteers for a new protocol. They are, yeah, yeah. It's gonna get better, but on a four kilobit, 2G network connection. Four kilobit, 1970s called. Yeah, countryside. I go to places that are a bit further away from mobile phone towers sometimes. Ah, yes. You see, the infamous holiday park Wi-Fi problem, isn't it? Yeah.

01:36:12
Yeah, there's fiber running down the road, but I can't have it. Yeah. Yeah. And it's just the current state of things. Um, so yeah, that was, that was an interesting experience, but generally it's been pretty good. Like as long as I remember to run a cargo build and maybe do my upgrade and update before I set off, actually, you know, doing this stuff on the move. I've mostly needed my web access for, you know, learning how to do things or playing with chat GPT, which is pretty, pretty efficient. I was able to do that on a, on a train ride. And.

01:36:42
still to and fro messages with ChatGPT, which is cool. It's because it ain't running locally, that's why. Requires GPUs on the other end to run it, I think. Yeah, it does. Yeah, it's just text over the wire after that, isn't it? Yeah. Which is pretty efficient. So yeah, that was kind of it for my update on all of the Rust things. Was there anything else you wanted to cover before we? I could go through just a brief list of...

01:37:11
a few technical issues, like go through my greatest hits of things that annoy me in Rust or things that I find interesting, if you like. Yeah, yeah, because for me, Rust has mostly been upside. It's taught me a bit more about where my stuff is living on heaps and stacks. I'm loving the result thing. I'm loving the lack of exceptions, the lack of nulls. That's all cool.

01:37:37
It's a nice modern language. I'm really enjoying that. So I'd be really interested to hear, from someone from the C plus plus side, what were you taking from it? What was frustrating you? Where do you think maybe it falls short compared to what you've done before? Go for it. Yeah. Okay. Well, I already mentioned one of those, which is the borrow checker. And I came across this while I was trying to do that

01:38:07
library transformation for the RotKraken rewrite. There's one obvious optimization when you're traversing file system trees that you can do, which is you allocate a single string buffer, guaranteed to be long enough for any path you're gonna handle. And as you traverse the file system, when you want to notify the client of a file, you just add the file name

01:38:37
to the end of the path, and then afterwards you remove it. If you want to descend into a directory, you add the directory name to the end of the path, and then recurse, and then you remove it afterwards. And what ends up happening is that all your string manipulation involved in being able to tell the client what each path is, as you're visiting the items, is all happening in one buffer, live, by rewriting the end of it, and keeping track of...

01:39:04
where the slashes are effectively, so that you can push and pop things off this thing that's kind of half string, half stack, that is your current path buffer. So... So you talked about having like a series of pointers to every slash in your string, and then you can just pop them off. Yeah, so you've got a top of stack. If you're doing it in C++, you've got a fixed size buffer of, let's call it four kilobytes, just to be ridiculous. And you've got a...

01:39:34
a where is the end of the path pointer, but that pointer happens at each level. It exists at each level. So at the root level, you just write the start of the buffer, overwrite the start of the buffer with each file or directory name in turn as you go, and you record the end of path pointer as being at the end of that item. And then when you recurse into a directory, you pass that, you pass the buffer,

01:40:04
and the pointer effectively down to the next level. And then that can append something past the end of path pointer, but it can also then pass the entire buffer to the consumer to process the entire path. So you can imagine a recursive programming pattern like this that just is furiously rewriting string data inside one buffer, but it's not doing any copying and copy. So no allocations. No allocations beyond the initial one. So.

01:40:34
that checks off two of my very few principles of lightning fast programming. Avoid unnecessary allocations, avoid copying. And this is partly how you got the very fast tree traversal? Well it's not because the problem I had was that when you're doing stuff like that, what you really want to do is you're building a stack of context data structures as you go

01:41:03
Ideally you want to implement that stack using the program stack. And the way you do it is to create an item in the current function, pass a reference to it to the recursive call to the same function, and then have that create its own item based on the one you passed it, which is now its parent. So the idea is you create a local instance in the function that has linkage to the parent level.

01:41:32
And in effect, through that linkage, the program stack also contains your stack of context structures, and you haven't had to do any heap allocation to implement it. So it's just a cute trick that you can do in things like C++, where you've got a recursive algorithm, you need a stack of contexts for that recursive algorithm, and you have a program stack already, so you just leverage it to become the stack of contexts that you also need.
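For reference, here's a rough Rust sketch of the parameter-passing flavour of that pattern, with hypothetical names and a growable String standing in for the fixed four-kilobyte buffer. As comes up below, this is the shape the borrow checker is comfortable with.

```rust
use std::{fs, io};

// Walk the directory whose path is currently in `buf`, reusing the same
// String for every path we report: push the child name on, recurse or
// notify, then truncate back to the parent path.
fn walk(buf: &mut String, visit: &mut dyn FnMut(&str)) -> io::Result<()> {
    for entry in fs::read_dir(buf.as_str())? {
        let entry = entry?;
        let file_name = entry.file_name();
        let name = file_name.to_string_lossy();

        let parent_len = buf.len(); // remember where the parent path ends
        buf.push('/');
        buf.push_str(&name);

        if entry.file_type()?.is_dir() {
            walk(buf, visit)?; // re-borrow the same buffer down the stack
        } else {
            visit(buf.as_str()); // hand the full path to the consumer
        }

        buf.truncate(parent_len); // pop the child name, no new allocation
    }
    Ok(())
}

fn main() -> io::Result<()> {
    let mut buf = String::from(".");
    walk(&mut buf, &mut |path| println!("{path}"))
}
```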

01:41:59
Okay, so where did you run into the borrow checker? So basically, reusing the path buffer, it was only really practical while at the same time having this stack of context objects because I could have done this in Rust if all the context had been in the form of parameters passed to each function call.

01:42:26
And then each recursion just passes a bunch of those down, and alters others, to the next call. But it's a really cumbersome way of doing it. You don't want parameter lists that are 20 items long; that doesn't make for readable code. So you want to define a struct that contains this encapsulated context, that you can initialize as one object locally, pass a single reference to when you recurse, and then have it create its own instance, and so on.

01:42:56
And that was where I really came into conflict with the borrow checker, because this path buffer is an external thing. And if you think about it, if you're passing a reference to the path buffer down the chain of a recursive algorithm, you end up with multiple mutable borrows, because every level needs to modify the thing. And that alone you can get away with in Rust, because Rust's borrow checker understands that yes,

01:43:24
in this invocation of function foo, it's got a mutable borrow to this object, but during a recursive call, that mutable borrow, although it's technically an alias, and although you're passing the thing to the recursive call and reborrowing it, so technically you've got two mutable borrows of the same thing, the one in the caller is inactive because the caller's code is not running. So,

01:43:53
in Rust you can just re-mutably borrow the same object down the call stack of a recursive algorithm like that. But it's when you want to start taking those references and putting them into context structures that it becomes a problem. Because then the borrow checker doesn't understand that subtlety that the previous data, the parent level, although it has a mutable borrow, is totally inactive

01:44:22
for the time that this recursive call is running and therefore it can be ignored. And so you start running into all kinds of problems of the borrow checker saying, no, that's multiple mutable borrows. You can't have that. You can't reborrow that. You start having to use... So you're saying that if the buffer is just a parameter on its own, it works, but if the buffer is a field on a struct, then it doesn't get it anymore.

01:44:51
then you start having to try and use lifetimes very heavily in order to illustrate to the compiler the relationship between a struct for a subdirectory and the same struct for the parent directory. When you're trying to create the struct for the child directory, you're gonna create it partly based on a reference to the parent struct, right? And at that point, you've got to remutably borrow the path buffer.

01:45:17
the child struct still needs to know about the whole path buffer so it can pass the whole thing and say this is the path to the consumer. But because the mutable references are inside a struct you then have to use lifetimes to say this new function that I'm writing for this struct that takes a mutable reference to a parent structure.

01:45:45
This instance is guaranteed not to last as long as the parent instance. And one of the things I ran into there was that the Rust borrow checker doesn't seem to make enough distinction between cases where things live for the same lifetime and where one thing lives no longer than another. So what happens when you try to do things like that is that you try to

01:46:15
write a function new, and you'll parameterize it with 'b: 'a, which means B lasts at least as long as A, I think, if I remember correctly. And so what you do then is you say that the return value from this function is the child struct with lifetime A, and the parent that it's getting passed has got lifetime B, and that way you express

01:46:42
that you're creating a child struct that will live no longer than the parent. Which makes sense if you're going deeper and deeper in these paths, because the parent path, you're processing in the longer term, but you dive into one of the child paths, you process it, you come out again, you process it, you come out again. So these are the this will definitely live shorter and you don't come out of the parent path until all of those child paths are done.
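In code, the shape being described looks roughly like this, with hypothetical struct and field names. The `'b: 'a` bound on the constructor says the parent's borrow outlives the child's, so the child context can re-borrow the shared path buffer without ever outliving its parent.

```rust
// Each level of the recursion holds a mutable reference to the shared
// path buffer, plus whatever per-level state it needs.
struct Ctx<'a> {
    buf: &'a mut String,
    depth: usize,
}

impl<'a> Ctx<'a> {
    // 'b: 'a — the parent lives at least as long as 'a, so the child
    // (which only lives for 'a) is guaranteed not to outlive it.
    fn child<'b: 'a>(parent: &'a mut Ctx<'b>) -> Ctx<'a> {
        Ctx {
            buf: &mut *parent.buf, // re-borrow the same buffer for the child level
            depth: parent.depth + 1,
        }
    }
}

fn main() {
    let mut buf = String::from("/data");
    let mut root = Ctx { buf: &mut buf, depth: 0 };
    let child = Ctx::child(&mut root);
    println!("child depth = {}", child.depth);
}
```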

01:47:08
Yeah. So that satisfies that. Okay. Exactly. Exactly. So where did you run into it? So that kind of sounds okay, albeit a bit fiddly. Yeah, it is very fiddly and it involves a heck of a lot of typing and it was quite annoying. But in terms of the syntax of the lifetime system, it was expressible. The problem, one of the major problems I tripped up over was that rather than throwing an error,

01:47:37
How can I put this? Rather than what you'd expect, which is for the lifetime part of the borrow checker to...

01:47:48
accept that and act accordingly, what'll often happen is that it will treat that statement, that the child lifetime lasts no longer than the parent lifetime, as actually a statement that the parent lifetime must be extended to the child lifetime in the calling function. So it kind of interprets that the wrong way round, and extends lifetimes in calling contexts rather than enforcing

01:48:18
that the callee contexts are definitely shorter lived. And so this is where I talk about the difference between the same lifetime and a lifetime that's longer than another lifetime. Because in practice, even when you express that B is a lifetime that contains lifetime A, what the borrow checker actually ends up doing with it is just making A and B equivalent. And that causes problems. Because that means, for instance,

01:48:47
in a caller in this algorithm that is iterating multiple subdirectories, you end up running into what's known, I think, as non-lexical lifetime problem case three, which is where you're in a loop. And there's a lifetime statement, a lifetime constraint involved in a function you're calling in that loop that's to do with its return value. And because of that lifetime statement

01:49:15
that return value is effectively extended in lifetime to the end of the function. It's extended outside the loop, which kind of makes a nonsense of things, because you definitely need that return value to die at the end of the loop. That's the only sensible thing. And so, as a consequence of that, you end up running into the ridiculous consequence of the compiler complaining that because you're calling a function inside this loop,

01:49:43
that returns an object that has a lifetime relationship with its parameters, that therefore any of the borrows you're doing as part of that function call get extended to the end of the function in the caller, which means that the next time around the loop, you can't borrow the same object. Even though everything is tightly scoped to the loop and nothing escapes the loop, the lifetime checker will pathologically extend lifetimes outside the loop to the end of the function.

01:50:10
in such a way that the loop itself won't compile, because it then thinks that you're multiply mutably borrowing the same thing when in fact, as you've written it, it's clear that the lifetime of the mutable borrow is limited to the loop iteration. So when we were talking about this before, I didn't follow it to the extent I just did now, and I glibly said unsafe. So you...

01:50:39
I think you said that you can't turn off the borrow checker with unsafe. Is that right? I have, in my recent adventures, thoroughly dispelled myself of the easy assumption that I think everyone makes about unsafe: that it's a get out of jail free card that just makes all the rules and regulations go away and you can basically write C plus plus with different syntax. That's not true. If you look at the docs, it clearly states that there are five things you're allowed to do in unsafe Rust that you're not otherwise allowed to do.
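For reference, the five extra abilities the Rust book lists for unsafe code are: dereferencing raw pointers, calling unsafe functions, accessing or modifying mutable statics, implementing unsafe traits, and accessing fields of unions. None of them switch the borrow checker off. A tiny sketch of the mutable-static one, which is the item that comes up below:

```rust
// Reading or writing a `static mut` is only allowed inside `unsafe`.
static mut COUNTER: u32 = 0;

fn bump() -> u32 {
    unsafe {
        COUNTER += 1;
        COUNTER
    }
}

fn main() {
    println!("{}", bump()); // 1
    println!("{}", bump()); // 2
}
```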

01:51:08
five things, five very specific things. So actually it's no kind of, well, I can look it up now if you want, but. That's fine. What bits related to what you were doing or didn't? Absolutely nothing, which is why it wasn't a solution. With unsafe you can. So it basically didn't solve your problem. No, with unsafe you can call unsafe code. I know that's a thing you can do. That's obviously something you'd need to be able to do. You need to call the function marked as unsafe.

01:51:37
inside an unsafe block, otherwise what's the point of having unsafe? And you can do... what's one of the other things you can do? I can't easily recall any of them offhand, but they're all very specific technicalities. It's a very limited list of things that does not include in any way arbitrarily overriding any borrow checker rules, with the exception of maybe one.

01:52:05
One thing you can do, something like mutating statics. I think you can do that in unsafe code, which you can't normally do. That's the only thing really that relates to compromising the borrow checker on that list, as I recall, which is interesting. Yeah, that is interesting. And as you say, eminently look-up-able in terms of actual details. Okay, so where did you end up with this? Did you give up

01:52:34
on that model of a single buffer? I gave up on it. I'm not absolutely convinced I couldn't have made it work, but I think even if I could, the ergonomics would have been so horrible in terms of being able to succinctly notify the consumer that here is this item at this path. I would have had to modify the interface in all likelihood to avoid the need for creating these temporary structures to the point of even getting rid of the...

01:53:01
temporary structure encapsulating the directory entry that you give to the consumer. And at the point where you need to bastardize the interface that your consumer is using that badly in order to make a light, get an efficiency improvement in a library, I just thought this isn't worth it. I've spent a fair bit of time on this. And frankly, it's time to cut my losses and be thankful for the performance I do get from Rust. And...

01:53:29
maybe admit that the borrow checker is not quite fit for consumption yet in a general sense. There are things that it really should be able to do that it can't do, such as this non-lexical... such as this arbitrary extension of lifetimes in callers of functions that have lifetime specifications. That doesn't really fly in my mind. And I saw references to a more advanced borrow checker prototype called Polonius that's been going for a while.

01:53:57
Polonius is apparently the guy who said neither a borrower nor a lender be. So another clever pun for tech geeks. Very good. But so this non-lexical lifetimes problem three is an interesting one because I believe it's that case to do with loops and the multiple mutable borrows error that you get.

01:54:24
when lifetimes interact badly with calls inside loops. And interestingly enough, if we recall, the original borrow checker, I don't think, had a name, or I've never found out what it was, but in recent years the non-lexical lifetimes upgraded borrow checker came about, was shaken down, and is now the default borrow checker, as far as I know, for modern versions of Rust.

01:54:52
The non-lexical lifetimes borrow checker originally solved problem case three. I don't know whether it's originally, but there was a point at which it solved this problem case three, but it was actually deliberately regressed at a certain point to not solve it anymore and to reject programs like that, because the performance cost of dealing with those code constructions was deemed too high. Right.
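For reference, the canonical illustration of problem case three from the NLL RFC looks roughly like this (lightly adapted); the loop situation described above is a variant of the same underlying limitation. The shipped borrow checker holds the first get_mut borrow for the whole function because the return value shares its lifetime, so the insert in the None arm is rejected as a second mutable borrow, while Polonius accepts it.

```rust
use std::collections::HashMap;
use std::hash::Hash;

// NOTE: this is the example of what the current borrow checker rejects;
// it does not compile today, but is accepted under Polonius.
fn get_default<'m, K: Hash + Eq + Copy, V: Default>(
    map: &'m mut HashMap<K, V>,
    key: K,
) -> &'m mut V {
    match map.get_mut(&key) {
        Some(value) => value,
        None => {
            map.insert(key, V::default()); // error: `*map` already mutably borrowed
            map.get_mut(&key).unwrap()
        }
    }
}
```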

01:55:19
So Polonius, I believe, is the new thing that can handle non-lexical lifetime problem case three, okay, and in fact can even do so with reasonable performance, which is very encouraging. And that naturally leads to the question, well then why the heck isn't it the default borrow checker already? And as far as I can tell, the answer is simply that there are multiple recently logged bug reports with Polonius;

01:55:46
it just isn't mature enough to mainline it, mainstream it, as the default borrow checker. So I really hope that in the next six months to a year, that'll get shaken down, we'll have another borrow checker change, to Polonius, and some of these code topologies that really should be possible will become possible again. But we're just not there yet. It's just a bad moment for it, I think. Yeah, yeah, this seems to be the journey of Rust. And it's one of the reasons I think it's a good language to bet on, is that, you

01:56:17
know, the further back in the history of Rust you go, the more of a toy language it was. It was a very single-purpose thing and it's just grown and grown in its capabilities and use cases. And this is just another, like from my point of view as a C sharp programmer, a super edge case, like oh, you can't do this very obscure thing I would never have even thought of. Yeah. And it's, I think it's really interesting that

01:56:41
Like, and yet as a C++ programmer, that stuff annoys the crap out of me, because I can do it in C++, and I had to make the conscious decision: okay, stop obsessing over this optimization. Take what you've got. Be thankful for it and just move on and continue building the product. Yeah, I think that's a really interesting divide between the C++ programmers and the C sharp programmers who are coming to Rust, because C sharp programmers generally, their idea of optimization is like,

01:57:10
am I doing network calls in a loop or nested loop? And for C++ programmers, it seems to be much more normal to expect to be able to optimize, you know, location-in-memory kind of stuff. And that's kind of the bar. So it's interesting to me to see kind of where we land, you know, with this Rust Workshop business and what kind of projects we end up doing. Cause there was definitely a trade-off here, and it's kind of good that these two camps are coming together and talking,

01:57:39
because either extreme is kind of not great. In the C-sharp extreme, what we end up with is like, yay, we've written this entire business in C-sharp. And what do you mean you don't like the Azure hosting cost? Yeah. And then on the flip side, you certainly can, and I'm not saying C++ programs do, because they're all intelligent people, but it's certainly possible to optimize when it didn't matter for the business.

01:58:09
And you know, that's the constant peril of C plus plus programming. Let's be very clear. It attracts the kind of obsessive freaks like me who like everything to be as fast as possible, and we will tend to waste too much time on that stuff. Yeah. Yeah. So it's really interesting to see that, you know, you got a significant speed boost just from rewriting from Perl to Rust, while also seeing, well, in C plus plus I could have done this optimization.

01:58:39
Okay, can I do it? Well, not currently. All right, well, I've learned a bunch of stuff. Like, you're way ahead of me in understanding of lifetimes now in Rust off the back of this. Certainly what we've been up to is fantastic learning fodder, much more than what you get from reading the docs, as it were. Yeah, so yeah, interesting stuff. Yeah, I think it's

01:59:02
C++, there are some great things about C++. One of them is that it sets an extremely high bar for power. It is the ultimate language for power of being able to express exactly what you want and getting that. Well, exactly what you want the machine to do specifically. Yeah, yeah, yeah. And especially what attracted me to it at first was the idea that

01:59:32
I could, it was expressive enough to do decent, vaguely readable design at the same time as getting that ultimate power to shape how the algorithms are working at the lowest level. That trade off was one. As opposed to C and assembler where it was very hard to do anything high level in terms of meaning. Yeah, yeah, you can do exactly what you want in both those languages, but good luck reading either of them compared to C++.

02:00:02
Even I get annoyed even with hardcore C programmer types who in my mind constantly come up with excuses not to use C++, which is basically a superset of C. Because they don't like X or Y. And the answer is always, well, don't use X or Y then. But there's no reason to chuck the whole language in the bin. Because it's just, it gives you more than you get with C. But yeah, even, even, you know, one of the, one of the reasons people like exceptions is that C code

02:00:32
C doesn't have exceptions. And as a result, C code that wants to handle errors becomes extremely verbose. And I think you see some of that verbiage in things like Golang. And one of the great triumphs of Rust is that it's taken that basic approach and added some very well worked out syntactic sugar to minimize the degree of verbiage if you know how to write it. And that's something I discovered dealing with results and errors.

02:01:01
is if you write it correctly with the right topology and structure, you can get something that mostly looks like a C++ program without too much extra garbage in there cluttering up the readability, and yet has a C-like or better-than-C-like degree of explicitly anticipating every possible error. Yeah, it's really quite beautiful. Like in C Sharp, OneOf, as a library, is gaining popularity and mindshare,

02:01:31
kinda does what result does, but like all of these things that are a retrofit, in this case a library that you use in the language, you very quickly run into a couple of things. One is, if you just use it as default you end up with some quite nasty stuff. They've got this AsT0, AsT1 that you end up having to call, where it's effectively positional, and

02:02:00
You know, in the project we were just on, yeah. We wrapped that into- Too much use of tuples by the sounds of it. Kind of, yeah. In the project we were just on, we ended up wrapping that in one of the microservices to something that ended up being really tidy and good for the purpose of that particular microservice. But ended up with a problem where we had 10 teams with five different ways of solving the same problem and basically...

02:02:29
couldn't come to an agreement at that point. Like, you know, should you use that OneOf bare? Should you wrap it in something? Should you do something that doesn't use OneOf? Whereas in Rust, there's no argument. It's like, well, here's the answer and this is just what we use. So we weren't really able to get all of the benefits. And then, you know, if you've got NuGet libraries that you then share between projects, which we did in that one, do you expose the OneOf stuff or do you expose your own custom thing, and are you forcing approaches? Yeah, it's a little bit messy.
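For contrast, here's a minimal Rust sketch of the Result-plus-question-mark sugar being described, with a made-up file name and error enum rather than anything from the project.

```rust
use std::{fs, io, num::ParseIntError};

#[derive(Debug)]
enum ConfigError {
    Io(io::Error),
    Parse(ParseIntError),
}

// `?` uses these From impls to convert each error into ConfigError.
impl From<io::Error> for ConfigError {
    fn from(e: io::Error) -> Self { ConfigError::Io(e) }
}
impl From<ParseIntError> for ConfigError {
    fn from(e: ParseIntError) -> Self { ConfigError::Parse(e) }
}

// Every fallible step is explicit, but the happy path stays readable.
fn read_port(path: &str) -> Result<u16, ConfigError> {
    let text = fs::read_to_string(path)?;
    let port = text.trim().parse::<u16>()?;
    Ok(port)
}

fn main() {
    match read_port("port.txt") {
        Ok(port) => println!("port = {port}"),
        Err(e) => eprintln!("failed to read port: {e:?}"),
    }
}
```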

02:02:59
So yeah, super nice to see the way that's been solved in Rust. Yeah, yeah. And so just to finish your thought on being a C++ programmer transitioning to Rust, when you've been doing C++ for 20 years, paying attention and trying to come up with programming patterns that maximize reliability and performance at the same time, it's easy to be blasé about how fast Rust is. And that leads to...

02:03:26
focusing too much on these minor complaints about the borrow checker's deficiencies and things like that, when the real perspective is: you've written entirely safe code that's missing whole chasms of crashing bug potential, and it's basically within 5%, or much, much less, of the speed of an equivalent C++ program. What are you complaining about? Shut up. Yeah.

02:03:55
I mean, the borrow checker not being fit for purpose, that's fighting talk. I mean, from a C sharp programmer, you just get a blank look. It's like, yeah. Like, this is worlds I'd never even thought of, and they look kind of safe. I'm happy. Yeah. Yeah. I hesitate on one level to make a criticism that strong, because it is a strong criticism, but having had enough experience with trying to do sophisticated things with lifetimes,

02:04:22
I think it's a justifiable statement. I think the, I think for people like me to.

02:04:30
get over to reduce the frustrations with Rust to a degree where it's a total no brainer to transition from C++ to Rust. That kind of thing is gonna help. But there's an argument that I'm such a nerd to begin with that the fact that I would even care about that is just ridiculous anyway. So fair enough. But I think things like being able to do a mutable borrow in a loop.

02:04:58
with lifetime constraints on what you're calling without completely stuffing up the whole loop and making it uncompilable is a very reasonable expectation for someone writing software with some sophistication in the real world. So would you say that that's like, regardless of the C++, C-sharp kind of starting point, would you say that's one of those things where there's a bit of a trap that anybody could just fall into and get stuck in for a couple of days while they figure out how they fell into something?

02:05:28
that if you're just doing functional programming in Rust, using parameters and return values, you can probably do a recursive algorithm of this type as sophisticated as you want and even reuse a single path buffer if you wanna do that kind of thing. The problem comes in when you want to use temporary data structures to encapsulate some of that complexity and make it less of an ogre to read.

02:05:58
the 20 parameter problem that I mentioned. It's at that point where you start grouping parameters into a struct that the borrow checker starts going, oh, you're using a struct and you're borrowing from another struct instance in order to initialize this field of this struct you're just creating. It's lifetime time, and at that point it all falls apart. So there are certain uses of temporary data structures of this kind that are just pretty much,

02:06:28
they're pretty much totally ruled out by the current borrow checker. And that's the annoyance that would be nice to have addressed. So basically there's a bit of a pit of failure there at the moment. They might sort it out over time. And probably there are fewer pits of failure than there used to be, but there are still some more to be squashed. Yeah, yeah. And in this sense, one of the reasons I like Rust so much is that the language itself, its documentation, and the evolution that you can observe...

02:06:56
from year to year in the language tells me that the people responsible for maintaining and pushing Rust forward are basically of the same mindset as the people on the C++ committee, in the sense that they believe in evolutionary change, they reject utopianism, and they want to make the biggest impact they can and get things right and have the right design philosophy.

02:07:24
And that's a lot of the reason to be excited about Rust, for me, because the reason I never considered any other language a serious contender to replace C++ mainly stemmed from things like utopian design philosophies that lead to silly decisions that are in conflict with the real world. And I'm very encouraged by the fact that this non-lexical lifetime problem, case three, is a thing I could find on the internet

02:07:52
for a language that's still relatively niche. There are people out there trying to write loops that do mutable borrows with lifetime constraints inside the loops and hitting this problem. People are discussing it. People are trying to come up with practical ways to allow programs like that while keeping the language safe. So it all speaks to the right mindset. And...

02:08:18
And so in a way, if you're a C++ programmer moving to Rust, and it bothers you that some of these things don't work and that some of these tricks can't be used at the moment in Rust, it helps to be aware that C++ is not just a language, it's a project that a committee has been working on for decades. And if you look at what that committee does and how they do it, they are pragmatic and evolutionary in how they do things,

02:08:46
and they don't let the perfect be the enemy of the good. And if you internalize that lesson, you can't be too harsh on Rust for the things it can't handle at the moment, as long as you trust that there will be an effort to address that later on. So I can say a couple of other general positive things. I was quite shocked, when I first started the first Rust project, by how quickly I became fairly proficient in

02:09:15
writing not just any code, but reasonably complex algorithms over associative data structures, and cycling them from first version to something that compiles in minutes. And even more shockingly, some of them actually worked first time, which happens once in a blue moon in C++ after a couple of decades of experience, but not very often. It actually became a more common thing in Rust. And the other thing is that, in general, I keep finding myself

02:09:45
asking myself, when I'm in the throes of Rust coding: I'd really like to do X, wouldn't it be neat if that were possible? And in general, it almost always is possible, and there's a way to do it. And the deep concern for performance and safety doesn't just underlie the language strongly; it has permeated to the people who are writing crates for it

02:10:14
and designing crates to do common jobs. Because it's one thing for the language to treat zero-copy, safety, and performance as paramount considerations, but one of the things I've been wrestling with, thinking about C++ versus Rust, is that really, all of that is convention. And if people write a bunch of crates that throw all of that to the wind and write the same crappy code that they might write as a new

02:10:43
C++ programmer, all those considerations in the language design haven't really done much good if a crate is still massively inefficient because somebody was pointlessly making tons of allocations and copying stuff around. But it does, the mentality does seem to permeate into the wider community and affect how crates are designed, which is very encouraging. And in general, when I'd like to do something that isn't possible, like with this shared path buffer or the

02:11:10
transitory data structures for the recursive algorithm, there's usually a good reason why it's not possible in the general case, even if it's frustrating when you know it isn't a problem in your specific case. So that's all very encouraging. Do you have any responses to any of that? Or I can just list a couple of other things quickly. No, it's super interesting to hear

02:11:40
how you've ended up feeling about it as a language to use more. No, I'm just, yeah, anything else you want to cover before we wrap up? Okay. Yeah. So, this question of what is at the root of the differences between Rust and C++ is one I don't have a definite set of answers to, and I'm still actively thinking about it. It's a fascinating question because when you get right down to it and you think about it for five minutes,

02:12:10
it's easy to end up thinking that all of the differences that matter are convention, that there's very little inherent about the language that is better than C++ in an objective sense. That's what makes it a very interesting question, because obviously the whole safety, performance, complexity trade-off

02:12:39
in the real world is massively different for Rust. And trying to trace that difference to its roots is a very interesting philosophical problem for me. So I'm gonna continue to think about that. But there are things like- Do you mean to say convention? Do you mean that all the things that are kind of the way to get things done in Rust, you can do it that way in C++, but you have to specifically choose to do- Exactly. Like the Result type returns. Exactly. C++11 introduced move constructors, for example.

02:13:09
So people like me who were like, oh, holy cow, I've been waiting for this for a decade, thank goodness, and started immediately using it, were able to cover quite a few more of the use cases in their code where move constructors were the only solution, and generally being able to move objects was the only solution, and immediately start benefiting from the performance improvements that that brought.

02:13:38
Yeah, Rust. Tell me what a move constructor is, I actually don't know. Well, in original C++, well, not original, but C++98, which is the one that's been around for the largest part of its history, I would say, or at least, in terms of people multiplied by time, has got the most use. You have copy constructors, which are like a new function in Rust that takes another instance by immutable reference and...

02:14:08
copies it. And the language can generate copy constructors by default in many cases where it can identify that the type is trivial and that just copying the bits will do the job. But there are situations that there are well-defined situations in C++ where the language does not generate a copy constructor because it's spotted that something that you're doing in your class hierarchy means that it's not that there's some unknown or ambiguity.

02:14:38
in how to copy object A to object B, some philosophical ambiguity in what it means for B to be a copy of A. Like if you've got a vector that contains a bunch of values, do you allocate a separate buffer for object B and copy all the elements from the buffer for object A? Or do you actually automatically have some reference-counted buffer that you just copy the reference to when you're generating object B?

02:15:07
All those questions need to be decided. And those are situations where copy constructor automation gets disabled and forces you to make a decision. So there's a specific signature, taking another object of the same type by immutable reference, that is the definition of a copy constructor. And for years, I wanted to move things, because I recognized that sometimes I'm done with vector A

02:15:36
and I just want to move it to location B in some different lexical scope, where I'm going to use the vector from now on. And if the code for the vector class was written only with a copy constructor, there's no real way to do that. You have the option of copying it wholesale with all the expense that involves, or you've got to throw the data structure in the bin and do something really clever and probably much more low-level and annoying

02:16:05
in order to get the effect of being able to move it. So a move constructor is like a copy constructor, except it takes a mutable reference, and the C++ standard dictates that the source object must be left in a destructible state but does not have to be left untouched. Which means that your buffer pointer inside your vector, you can copy that into object B, null it out in object A, and as long as the destructor for that object

02:16:33
is written in such a way that it expects null as a value of that pointer, ignores it, and doesn't try to deallocate anything, then you've made it possible to move from object A to object B, because the move constructor cannibalizes what it wants from object A and does just enough rewriting of object A to mark it as obviously not using those things anymore by nulling out pointers.

02:17:01
and then the destructor can notice that and just not bother destructing it normally, and then you've got moves. So that's a perfect example of what you're talking about. C++, in C++11, made it possible to move objects, which has massive performance implications, especially for return values from functions that are hefty. So how does that relate to Rust? Well, the only difference is that Rust does move by default and it makes copy the exception.
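A quick sketch of the Rust side of that, as a toy example rather than anything from the projects discussed: moves are the default and need no constructor machinery, and small types opt back into copying via the Copy trait.

```rust
fn consume(v: Vec<u32>) -> usize {
    v.len()
}

fn main() {
    let a = vec![1, 2, 3];
    let b = a; // `a` is moved into `b`: no buffer copy, no null-out-the-pointer trick
    // println!("{:?}", a); // error[E0382]: borrow of moved value: `a`
    println!("{}", consume(b)); // `b` is moved into the function the same way

    let x: u32 = 5; // u32 is Copy, so assignment copies and `x` stays usable
    let y = x;
    println!("{x} {y}");
}
```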

02:17:29
And this is what I mean when I say that almost all the differences are in convention. It's possible to do everything Rust can do in C++ and more. And yet at the same time, that three-way trade-off, I think, is noticeably better in Rust for the most part. And I'm still trying to work out how much of that is convention and how much of it is very specific things, like the existence of the borrow checker and how it enforces read-write lock semantics on references;

02:17:59
that's a big thing. But again, I mean, that's really a way of preventing you from writing obviously incorrect programs, which could be seen as a form of convention. So it's a tough one to even define where the boundaries are with this kind of question.

02:18:22
So the only other couple of quick annoyances I wanted to mention were just around integers. So Rust tries to make every important bit of complexity explicit. There are things like the dyn keyword that force you to annotate every site where you're using dynamic dispatch clearly. Dyn meaning: I don't know what size this is gonna be until runtime. Yeah, yeah.
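A tiny sketch of what the dyn keyword marks, with made-up trait and type names: the dynamic dispatch stays visible in every signature that forwards the trait object.

```rust
trait Sink {
    fn write(&mut self, line: &str);
}

struct Stdout;

impl Sink for Stdout {
    fn write(&mut self, line: &str) {
        println!("{line}");
    }
}

fn inner(sink: &mut dyn Sink) {
    sink.write("dispatched through a vtable");
}

fn outer(sink: &mut dyn Sink) {
    inner(sink); // the `dyn` in both signatures keeps the dispatch visible
}

fn main() {
    let mut out = Stdout;
    outer(&mut out);
}
```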

02:18:51
So in practice, most of the way you use dyn in reality is you use a dyn reference to pass a polymorphic object, like a trait, into a function. So although dyn does technically mean, I guess, what you said, which is that this doesn't have a known compile-time size, in practice it often means: I'm using dynamic dispatch on this. And the fact that you have to mark dyn

02:19:20
much as you have to mark mut at every level where you're re-borrowing the same reference, means that all of the parts of your code that might be subject to inefficiencies due to dynamic dispatch are clearly marked at every level. And I guess this is a pattern that Rust picked up from C#, because C# started this great thing of forcing you to annotate references all the way down the call chain

02:19:48
instead of just where they're defined initially. And that's a great piece of ergonomics. So why did I bring up dyn? So yeah, dyn is an example of forcing the programmer to write unnecessary characters in a way that produces much more readable code. And in that sense, I think some mistakes have been made around some of the integer handling.

02:20:17
There's no integer promotion or demotion. I discovered this quite early on with Rust. There's no automatic promotion or demotion. So like from a u16 to a u32 or something? Yeah, and it's those unsigned cases that are really hard to defend in my mind. You're going from one unsigned type to another. There's literally no way the value can be mangled or truncated going from a u16 to a u32 in any way. You can argue that

02:20:45
going between, I guess it's the same with i16 to i32, that's also true, but as soon as you start moving between signed and unsigned or going down in size, that's where the possibility of truncating or otherwise corrupting the values comes in. But even in those guaranteed cases where the value will not be mangled, Rust doesn't automatically coerce the values.
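As a small illustration of that point (nothing project-specific here), even the provably lossless u16-to-u32 conversion has to be spelled out:

```rust
fn main() {
    let small: u16 = 1_000;
    // let wide: u32 = small;              // error[E0308]: mismatched types
    let wide: u32 = small as u32;          // explicit cast
    let also_wide: u32 = u32::from(small); // lossless conversion via the From impl
    println!("{wide} {also_wide}");
}
```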

02:21:14
So what would you expect to happen, and what's the impact of that? Well, the example I can give is from one of my first Rust forays. This was actually my first Rust project, but I never really got it compiling; it was just an exploration of writing some code. I was trying to do some prototyping of B+ tree data structures, which was an evolution of some work I'd done in C++, and I'd got some working code there, and

02:21:44
with B+ trees, the problem you have is that you have to define the data structure very tightly, especially if it's gonna be for an on-disk format, because you've got a power-of-two block size on the disk, which means that you've got room for a power of two minus one slots in the disk block, plus a tiny bit of header metadata. And with those space constraints, what that means is that the header has to contain

02:22:12
how many of my slots are actually used, like in a vector basically. You can't be storing that as a usize, because that's all your header space gone. You have to be able to store that in a u8, for example, in order to meet your space budget for a B-tree node.

02:22:33
usize being the one that's whatever the architecture happens to be. Yeah. usize being the equivalent of size_t in C++, which is defined to be a pointer-sized unsigned integer. So in the case of that B-tree code, for instance, if your disk block, say, is 4K, it can contain 255 slots.

02:23:02
So it obviously makes sense to use a u8 for the how-many-slots-are-used number. What that means is that every time you index the slots in your code, you have to go 'how many slots are used as usize' and explicitly upcast the integer to a larger unsigned type in order to index any kind of array or slice on the basis of that number. So this is when you're reading, basically you're reading the u8 into some other variable that was bigger.
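A rough sketch of that friction; the field names are illustrative and not from the actual B+ tree code. The slot count lives in a u8 to fit the header budget, but slice indexing wants usize, so the cast appears at every access:

```rust
struct Node {
    used: u8,          // how many of the slots are occupied
    slots: [u64; 255], // fixed-size slot array (block layout simplified)
}

fn last_used(node: &Node) -> Option<u64> {
    if node.used == 0 {
        return None;
    }
    Some(node.slots[(node.used - 1) as usize]) // the `as usize` shows up at the indexing site
}

fn main() {
    let node = Node { used: 3, slots: [0; 255] };
    println!("{:?}", last_used(&node));
}
```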

02:23:32
And you know that that's never gonna be the same. It's not really another variable, it's just an on-the-fly cast. But the point is that in the indexing expression with the square brackets, you have to put 'index as usize' in there. And your code ends up littered with that. Because the indexing is expecting usize, but you've got a u8. Yeah. And that's a completely safe operation, because the u8 fits perfectly inside a usize always. Yep, yep. And also because even for some of the pathological cases where...

02:24:01
you are casting between unsigned and signed or vice versa and values can get corrupted, Rust does bounds checking of arrays in all builds. That's one thing where they haven't compromised safety for performance, even though there can be a compelling argument

02:24:28
for not bothering with bounds checking. And C++ programmers often don't, and they rely on design invariants to make sure it doesn't crash. So, okay, fair enough. They left the bounds checking in at all times. But if the bounds checking is always gonna be there, then I'm not sure there's a very strong argument for outlawing using different integer types for indexing. And there's certainly no argument for not implicitly promoting smaller

02:24:54
types to larger types within the unsigned and signed silos. So that's the kind of thing that, when you're doing certain types of job in Rust, is going to be a big annoyance, and there's no real reason for it that I can see; other than ideological ones, there's no pragmatic reason to disallow that. Now, one of the things I've seen is that somebody suggested, well, why not make indexing able to handle integers of all different sizes

02:25:24
as a halfway solution to this. And that's a good solution, but I'm sure there are other cases where not having integer promotion is probably gonna hurt the readability of code. So yeah. It seems like you end up with a bunch of visual noise in the code because of it. Yeah, yeah. And underlying all this is the pragmatic decision that the Rust designers have made

02:25:51
not to really go near integer underflow and overflow too much. There are cases where arithmetic operations are checked, but it's pretty sporadic and it's not done in general. And I think that's a wise decision, because that's a feature of the underlying hardware. C++ makes the same decision, that underflow and overflow are your problem, at least when it comes to integers, because verifying this...

02:26:20
at compile time is impossible, and verifying it at runtime is expensive, because you use integers too much to make that worthwhile. So I think Rust made the right design decision there, but then for some reason it's fallen into the trap of being a bit ideological about forcing the programmer to be explicit about interconversions between integers in a way that is not useful 90% of the time and hurts readability.
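For context on the "checked, but sporadic" point: plain arithmetic panics on overflow in debug builds and wraps in release builds by default, and the explicit checked, wrapping, and saturating methods are there when overflow needs handling deliberately. A quick illustration:

```rust
fn main() {
    let a: u8 = 250;
    println!("{:?}", a.checked_add(10));  // None: overflow detected and reported
    println!("{}", a.wrapping_add(10));   // 4: wraps around explicitly
    println!("{}", a.saturating_add(10)); // 255: clamps at the type's maximum
}
```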

02:26:49
That's my opinion at the moment. I don't know. I think the later it gets, the less likely it is to change. But I could be wrong. So it's just kind of an irritation when you're writing the code basically. Certain types of code. Maybe a lot will depend on the fact that you can now write Rust for the Linux kernel. I saw some reference to that recently. That provisions have been made to make that possible and easy.

02:27:19
um, in the infrastructure of the kernel, that might end up solving this problem, because people writing kernel code are going to be banging their heads into integer promotion issues all day long. And they're going to probably kick up a stink about it and force the language designers to get a clue. Yeah. And it seems to be that the evolution of Rust has been driven by the evolution of use cases, as they adopt more and more things you can use it for. So that seems eminently possible. Well, it's evolutionary change.

02:27:47
It's the only game in town that works, as far as I'm concerned, so absolutely. That's why Rust is as good as it is. Anything else on your list? The only other thing is a related minor gripe about usize, which, as I mentioned, is basically exactly the same as size_t in C++. And again, this is a bit of ideology that seems like it

02:28:17
probably has measurable performance impacts, certainly in memory use. The idea of forcing indices to be 64-bit on 64-bit builds, which is what size_t and usize do, has always struck me as barmy, frankly. Even 32 bits is such massive overkill

02:28:43
for most areas of most programs when it comes to indexing collections. That's four billion items in a collection. That's a lot of items. It is big, yeah. And so to require those numbers to all be 64 bits is absolutely terrible, I think. You're constantly allocating 64-bit integers all over the place when you've got a collection that you know isn't going to exceed 100 items. What the heck is the good of that? So it bloats memory footprints.

02:29:12
I cannot answer that one. Yeah, it bloats memory footprints of programs which has knock-on effects for cache performance in the CPU and memory use, obviously. And although most modern CPUs have got rid of the differences in clock cycle counts between various operations on different sized integers

02:29:39
most of those operations now just take one cycle when you take the pipelining into account. Still, from a memory-use perspective alone, and the resulting performance issues with the cache, it just seems utterly unnecessary. And this annoyed me enough in C++. You can't change this, that's it. Vector indices are 64-bit on a 64-bit architecture. Yeah. And this annoyed me enough in C++ when its standard library defined size_t and used it

02:30:09
in such a way that they conflated sizes with indices, for a start. That's one problem because, I mean, that's arguable either way. Maybe that's not such a big deal. This is turning into how to annoy a C++ programmer. Yeah, but in the C++ standard library it's kind of annoying, because it affects all the standard containers, but at least the basic language doesn't enforce that.

02:30:36
You can use whatever integer types you want for indexing raw arrays, for instance. The slightly more offensive thing in Rust is that, precisely because it's a better language that's more joined up in its design from the low level to the high, this problem of 64-bit usizes ends up percolating to the most basic things, like raw arrays, in a way that's just totally inescapable

02:31:05
um, and concomitantly more annoying. So I kind of, I didn't know enough or spend enough time to get that B+ tree code working, and I'm kind of glad I didn't, because it would have been a bit of a hassle for some of these reasons. Um, so yeah, so that's a couple of minor annoyances, having to use more usizes than I would like rather than more constrained integer types.
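A quick look at the sizes behind that footprint complaint (on a typical 64-bit target):

```rust
use std::mem::size_of;

fn main() {
    println!("u8:      {} bytes", size_of::<u8>());      // 1
    println!("usize:   {} bytes", size_of::<usize>());   // 8 on a 64-bit target
    println!("Vec<u8>: {} bytes", size_of::<Vec<u8>>()); // 24: pointer + usize length + usize capacity
}
```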

02:31:35
Okay, so like, things that a C++ programmer is going to sigh at, but not necessarily enough to put people off using it in real projects or commercially, just like, oh, a bit more noise in the code, but like, this could have been more optimised, but really unless you're up against it, it's not going to be a huge, like, this isn't, like, make or break on a decision for the language. Yeah, and you can always...

02:32:04
write wrapper functions, and also the vast superiority of Rust's macro system to C++'s would probably provide you with good workaround opportunities for a lot of this stuff, because you could just automate that cast as a macro. And because the macro system is not a horrible kind of throwback based on

02:32:32
tokenized text processing, but is actually featureful enough to allow for a lot of the standard macros that you see in Rust, including things like print macros that literally don't allow you to mismatch your placeholders with your interpolation arguments and stuff like that. It's obviously much more sophisticated. And so for that- I believe it modifies the abstract syntax tree, and that's how it works. Right, right.
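For what it's worth, a hypothetical sketch of the "automate that cast as a macro" idea; the at! macro here is made up, not something from either project or from a crate:

```rust
// Wraps indexing so the `as usize` noise lives in one place.
macro_rules! at {
    ($slice:expr, $idx:expr) => {
        $slice[($idx) as usize]
    };
}

fn main() {
    let slots = [10u64, 20, 30];
    let used: u8 = 3;
    println!("{}", at!(slots, used - 1)); // expands to slots[(used - 1) as usize]
}
```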

02:32:57
So macros in Rust are an area where I'm completely ignorant and have deliberately stayed completely ignorant. And I suspect that a lot of these annoyances will, in practical terms, go away once I learn how to use the macro system to make them not a problem. That's for another episode, clearly. Yeah, yeah. Otherwise, just a final more positive thing on what is genuinely different between C++ and Rust.

02:33:26
As I've said, fundamentally, it's not a very long list. Probably a list you can count on one hand, if that. But something I noticed when I was getting into this file system traversal library was a couple of revolutionary things around interfaces and polymorphism. C++ fundamentally conflates polymorphism and...

02:33:55
abstract interfaces. In C++, you can't write an abstract interface that says: I'm defining a kind of struct template here, with these methods that take these parameters and have these return types. You can't really define that as an abstract thing in such a way that you can validate a concrete class against it. That's not how it works. Your only ability to define ironclad interfaces in C++

02:34:24
is via polymorphism. So you define pure virtual functions, which are dynamically dispatched function calls. And you can define what you'd call an interface in C++ or C#, which is a kind of abstract base class that has slots reserved for all of these dynamically dispatched functions that are unfilled. And then you implement a concrete class

02:34:51
and it fills all those slots with real implementations. But that conflates polymorphism, the ability to override functions in such a way that the override is called even if the reference is to a base class, with defining the interface. Rust separates these concerns properly. So what I found- The traits. Yeah, so what I found with the directory traversal was that I could write an interface

02:35:21
as a trait for the consumer of the directory tree without committing to making dynamically dispatched calls in order to use that interface. And at the technical level, there's a more interesting detail to that, which is that I think the implementation of vtables, which are these static blocks of memory containing the slots with the method pointers, is fundamentally different in Rust. It seems like when you use a dyn reference

02:35:50
to pass a polymorphic object into a function, you end up with a kind of fat pointer, which is a struct pointer combined with the vtable pointer, sent to the function as a unit. In C++, the vtable pointer is just an embedded field inside the struct, which gets passed through a single normal reference.
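That difference is easy to see from the reference sizes; a &dyn reference carries the data pointer and the vtable pointer together (trait and type names here are just for illustration):

```rust
use std::mem::size_of;

trait Speak {
    fn hi(&self);
}

struct Dog;

impl Speak for Dog {
    fn hi(&self) {
        println!("woof");
    }
}

fn main() {
    println!("&Dog:       {} bytes", size_of::<&Dog>());       // one pointer (8 on a 64-bit target)
    println!("&dyn Speak: {} bytes", size_of::<&dyn Speak>()); // two pointers: data + vtable
}
```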

02:36:16
And so that has all kinds of interesting technical implications. Um, but apart from that, just separating interface definition via traits from polymorphism means that you can do something I've always wanted to do in C++, which is to specify abstract interfaces for classes that are going to be used concretely, without any dynamic dispatch. That's a great safety feature, and I love being able to do that. And it also means that you can

02:36:45
define an interface for something like this directory traversal and make a decision, like I did, that you're going to use generics as the means of using this interface for a concrete receiver, a concrete consumer, and know that you can revisit that decision later and say: actually, I'm going to use dynamic dispatch and get rid of the generics, and you don't have to change much code, which is very nice. Cool. So yeah.
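A rough sketch of that shape; Visitor and the function names are hypothetical rather than the actual library's API. The trait defines the interface once, and whether calls are monomorphised or dynamically dispatched is decided separately at the use sites:

```rust
use std::path::Path;

trait Visitor {
    fn file(&mut self, path: &Path);
}

struct Printer;

impl Visitor for Printer {
    fn file(&mut self, path: &Path) {
        println!("{}", path.display());
    }
}

// Generics: static dispatch, monomorphised per concrete visitor, no vtable involved.
fn walk<V: Visitor>(root: &Path, visitor: &mut V) {
    visitor.file(root); // real traversal elided
}

// Same trait via a trait object: dynamic dispatch, swappable without touching the interface.
fn walk_dyn(root: &Path, visitor: &mut dyn Visitor) {
    visitor.file(root);
}

fn main() {
    let mut printer = Printer;
    walk(Path::new("."), &mut printer);
    walk_dyn(Path::new("."), &mut printer);
}
```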

02:37:14
And I have the feeling that this whole business of not embedding the vtable pointer in the object, but just putting it alongside the object pointer as a fat pointer with two numbers in it, probably has some implications for simplifying object layout by the compiler that translate into greater efficiencies at runtime. But I haven't had a chance to think through that very deeply.

02:37:42
But it's an interesting, completely different approach that they've taken compared to C++, which I'd kind of accepted as the way you do it. Because you can embed the function pointers in the struct, I've done that in C++, but that means every instance allocates all those slots, and it's a waste of memory. So the vtable pointer inside the object is more memory-efficient. But I'd never really thought of that alternative of carrying the vtable pointer externally.

02:38:12
And I think that, combined with the Java-interface-like model that Rust has (traits rather than implementation inheritance), simplifies a lot of the gnarly details of code generation by the compiler in a way that probably has performance implications. So it's another thing I'm gonna keep thinking about out of curiosity, but yeah. I think

02:38:38
the C++ programmers are listening carefully and the C# programmers like me are starting to get a little bit hazy. Yeah, they're probably throwing popcorn at the screen by now, aren't they? C++ programmers talking again. I can see why C++ programmers and C# programmers don't really know how to talk to each other half the time, and Rust is this interesting melting pot where they're all coming together for different reasons.

02:39:06
With what I've just been talking about, C# and Java programmers are a bit worse off than C++ programmers in that they don't have to think about this. Yeah, and that's because they inherit the model that C++ uses for interfaces and polymorphism, but with even more kind of elided away automatically so that you don't have to think about it. So they're slightly worse off than C++ people that way; Rust kind of makes you think about it even more

02:39:33
and with a very nice separation of concerns. So yeah. Yeah, cool. Well, the rest of my life is in danger of injecting itself into this recording. So. I know, invasion. Yeah, I shall call it there. Yeah. Thanks for the conversation, I've enjoyed that. As usual, we've had a lot to talk about; maybe this is the trade-off for not doing podcasts so often, we end up with longer ones, but hopefully some people will be finding this

02:40:03
interesting throughout. I think it's been a pretty interesting conversation. It's really good to actually take some time to think this stuff through and put together a proper update on the whole adventures in Rust. Still hoping for big things for the Rust Workshop. The next big thing on that is really getting the presentations out there and seeing what interest that drives and what kind of conversations it drives. It's kind of at that research stage at the moment of let's build some stuff, let's learn the language, let's...

02:40:32
get some brand and some conversations and just see what happens. We're going to be doing a C# thing soon, in the next couple of weeks. I will chip away at the Rust thing, but we'll see how much time I get to spend on that amongst everything else. Thanks very much Jim, thanks to anyone who's listening. See you next time on the Rust Workshop Podcast. Until next time!