Swift Package Indexing | 42: We need a “No one expects the Spanish Inquisition” sound effect

Join us for another episode as Dave and Sven talk open-source security vulnerabilities and how all package ecosystems are at risk, why it won't be possible to give meaningful "package size" stats on package pages, yet more talk of interfacing with Swift from other languages, and a one-question quiz! Plus package recommendations, of course!

Follow up

Dead code stripping / Link time optimisation
- https://forums.swift.org/t/pitch-support-lto-for-swift/67379
- https://developer.apple.com/wwdc22/110362

News

Packages

WhisperKit by Zach Nagengast
- MacWhisper by Jordi Bruin
Async-Channels by Brian Floersch
- Performance discussion on the Swift Forums
- swift-async-algorithms by Apple
KeyCodes by Matt Massicotte
- SFSafeSymbols by Frederick Pietschmann
Ignite by Paul Hudson
- Publish and Plot by John Sundell

Creators & Guests

Host

Dave Verwer

Independent iOS developer, technical writer, author of @iOSDevWeekly, and creator of @SwiftPackages. He/him.

Host

Sven A. Schmidt

Physicist & techie. CERN alumnus. Co-creator @SwiftPackages. Hummingbird app: https://t.co/2S9Y4ln53I

What is Swift Package Indexing?

Join Dave and Sven, the creators of the Swift Package Index open-source project, as they talk about progress on the project and discuss a new set of community package recommendations every episode.

Did you notice that we had a noticeable drop in traffic yesterday?

I didn't because I was out of the office yesterday.

Oh, that's probably why it dropped. It was just that one unique missing.

Well, I did have to stop my script that just constantly hits our site.

Oh, right. Actually, that reminds me of a nice other episode. I was working on a project once,

and this was sort of an agency sort of setup. And we were working on an app and starting to

launch it, and they sent around an email, people in the office, please make sure that you disable

tracking so your traffic isn't part of the traffic that we record, the uniques and stuff.

And I thought, folks, if you need to filter out the 15 people you have in the office,

then your traffic is not worth tracking.

Yes.

Well, no, I didn't. So the short answer to your question is no, I didn't notice that we got a dip in traffic.

Can you imagine why that is, that we had a drop yesterday?

It wasn't the bank holiday. It was a very belated April Fool's trick. I don't know.

I'm pretty sure it was the eclipse because it's the US traffic that dropped off significantly.

Oh, everyone was outside.

Yeah, they were all sun gazing.

Right. Yeah. And because we could only, there was only a few places here that could just see the

tail end of it, right?

Oh, right. Or did it hit the UK? Because it certainly didn't hit mainland Europe.

I think it just, it hit the top corner of Scotland. And then I think in the band where I was,

it was supposedly visible for about 30 seconds or something like that.

But yes, that makes sense.

And how much of a percentage drop was it?

Well, it's noticeable. I think in the US traffic,

it might have been, don't quote me, maybe even 50%. I mean, it looked like a weekend effectively.

And a weekend is typically 50% down. And as a double check, India had a higher day

than normal yesterday, Monday. So it was clearly, and India is the second highest source of traffic,

interestingly.

Just, just,

Just barely behind the US.

Well, so luckily, it doesn't happen again for how, it's like another 30 years or something.

I, well, I mean, I think there's, there's certainly other ones happening. I just don't know where. I think Australia is next at some point.

They're quite, quite frequent. You should ask ChatGPT and it'll give you a confident answer.

At least it'll be confident.

Well, okay.

Well, I'm glad it's recovered today, at least.

We also have some follow-up from our previous recording.

And this is from Mr. Anonymous, a person who's been really frequent in their follow-up.

And I really enjoy that.

I really love getting the follow-up so that we can at least in hindsight make sure that

what we talk about is more correct than the first time around.

So anonymous, in fact, that I don't even know who it is.

And last time we talked about this project, this package to measure the code size in packages.

And the idea was to get a feel for how much your app or library will grow if you use a

certain package in your app or library.

And we were discussing, you know, could this be an opt-in mechanism and so on.

And this anonymous person said, actually, that won't make a lot of sense because Swift

packages are statically linked and dead code stripped.

So you effectively only pay for what you use pretty much.

Right.

If everything works as planned.

And I think there's some changes, still ongoing changes.

So this isn't quite perfect yet.

But it certainly has been improving.

And this will certainly also continue to improve.

So it'll always be.

If we were to do this kind of thing, I think what we'd be measuring is effectively only

the worst case scenario.

And that's a very misleading figure then, you know, because if you used everything

a package had to offer, that's what you then would pull in.

But, you know, no one ever does that.

So publishing that metric would be pretty useless.

Yeah.

I hadn't thought about that.

But it's obvious in hindsight.

I think in the JavaScript.

I quite like the name of it in the JavaScript world.

They call it tree shaking.

Anything that's not attached to a branch falls out.

Yeah.

Oh, I should also mention it's a bit murkier than that because if you're using @inlinable

and you have lots of generic functions, it can actually grow because for each @inlinable

use of a generic function, you will generate more code on your end.

So it's give and take.

It's a really difficult thing.

And it's very specific to your use case.
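
To make the @inlinable point concrete, here is a minimal sketch. The function name `largest` is made up for illustration and does not come from any package discussed in the episode. Because the body is marked @inlinable, the compiler is allowed to specialise it inside the client for each concrete type it is called with, which is where the extra code on the client side can come from.

```swift
// Hypothetical library function, for illustration only.
// @inlinable exposes the body to client modules, so the compiler may emit
// a specialised copy in the client for each concrete type used.
@inlinable
public func largest<T: Comparable>(_ values: [T]) -> T? {
    values.max()
}

// Three call sites with three different element types: each one can end up
// with its own specialised code in the calling module.
let biggestInt = largest([3, 9, 4])
let biggestWord = largest(["pear", "apple", "zebra"])
let biggestDouble = largest([0.5, 2.25, 1.0])
```

Whether the binary actually grows depends on optimisation settings and on how the function is used, so treat this as an illustration rather than a measurement.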

So I think, you know, the message effectively is it's going to be.

We shouldn't do it.

No.

And I think the main thing there is that if you give a piece of information on a package

page like that, if it's a glanceable bit of information, it has to be fairly robust.

Yeah.

And anything that we could put there that might be misleading is.

Is something we should try very hard not to do.

Yeah.

I mean, exactly.

If it's so specific, I mean, there's really no real value because even if you were to

use it in different projects of yours, you'd get different results.

It's not even.

Yeah.

I just don't see a way how we could possibly derive anything useful out of this.

Yeah.

It's the end of the feature, unfortunately.

But that's okay.

Because these things that you think about a feature.

Yeah, exactly.

That's how.

Yeah, that's how it works.

Yeah.

Exactly.

So one thing I wanted to briefly talk about is the issue that popped up.

I think it was last week or the week before, certainly after we last recorded, which was

two weeks ago.

And that was yet another supply chain ecosystem thing.

And that's the XZ package backdoor that happened.

And the.

Well,

the.

Everyone has probably seen this, but just to summarize very briefly.

So there was a widely used package, XZ, that's around compression, and it's used in lots of other packages, amongst them SSH and the Linux kernel.

And someone managed to sneak in code that can be used as a backdoor into this package.

And it was caught just before it was being widely rolled out into all sorts of packages and Linux distributions.

So that was a really.

Really critical thing that happened and was caught at the last minute by chance effectively.

I mean, it's kind of crazy that this didn't turn into an actual backdoor.

There's a couple of really good write ups of this that we'll put in the show notes.

Yeah, exactly.

And I think interestingly.

If you haven't kept up with it.

Yeah.

Yeah, exactly.

And I think it's interesting in two aspects.

Well, maybe at least two.

So one is technical.

So I think technically it's interesting how that was actually done, because even when you knew it was happening, it was really hard to detect.

I actually looked at one of the pull requests where the disabling of one of the detection mechanisms was snuck in.

And I knew sort of what to look for.

I didn't see it.

I mean, also, I didn't really understand all of the details of it, but there was an extra dot put into a pull request, which is like impossible to see unless you

really look at it.

Sure.

I guess it really depends a lot on how you look at pull requests and diffs.

And if you were to use the normal GitHub mechanism to do that, you probably wouldn't see that extra dot.

So that I found that interesting.

The other interesting bit was that it really abused the weak spot in how open source software is being maintained.

And by that I mean that it's often sole maintainers that are working on things themselves.

And are sort of left to deal with everything themselves.

And it's quite easy to sort of gain trust, become part of the team and then inject code that way because it's that's the mechanism, right?

That was used here.

It wasn't someone sneaking a code review past someone or just solely that it was also someone being part of the team.

And that made it a lot easier to get that past watchful eyes.

Yeah.

And that's tricky.

The other really big one that's worth mentioning there is that a lot of solo maintained open source projects have this other problem, which is that the person who had the original problem and developed some code to fix their original problem, they might not have that problem anymore.

And so they gradually kind of drift away from wanting to maintain it at all because it's not actually a project that they use in their current role.

And so this is a real problem as well.

And so when somebody steps up to say, oh, I'll help.

Yeah, you're happy, right?

Yeah.

That can, you're very happy.

And that can be like, oh, thank everything that somebody is interested enough to help with this.

And it can be very tempting just to let it go too easily.

Yeah.

And it's sort of for everyone using that package, it changes under the hood, the trust that they initially had when they picked it as a dependency.

You sort of change it under them.

You know, it's as if there's a popular website that gets sold and then, you know, someone else is sort of maintaining it and you suddenly, you know, it's a different mechanism.

It's a very different, you don't have the same sort of buy in to your initial dependency pick that you made.

So, and I wonder how you could even protect against that.

I think that's a really difficult thing to deal with.

And money is really only part of it.

I'm not even sure if that maintainer was better paid, that would have changed anything of the outcome, right?

And that's the typical thing that pops up.

I quipped, I guess I wrote on Mastodon, silly idea.

How about maintainers were paid a dollar per star per month from a fund that big tech companies pay into.

Right.

And of course that's silly, right?

Because money isn't the only problem as we've just discussed.

And it's obviously also very easy to game that sort of thing, you know, just have accounts go around starring the thing.

But, you know, I thought.

It was interesting to look at that as well, because in some cases I think it might help if people were able to make it their job to maintain open source, which in many cases it can't be right.

It's an evening thing.

It's a weekend thing.

It's a side thing.

And that makes it harder for people to actually stay vigilant and stay, in quotes, "on the job", which isn't actually their job.

I think.

And it's something, it's a metric we use in the package index scoring calculation.

And the metric we have is how many contributors does a package have?

And we give a number of points on various thresholds of kind of one, or I think there's one for two and then more than five and then more than 10 or something like that.

And there's diminishing returns, but having that indicator of there is more than five.

And I think that's a really important thing.
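
A threshold-based score with diminishing returns, like the contributor metric described above, could be sketched as follows. The exact thresholds and point values here are assumptions made for illustration; they are not taken from the Swift Package Index source code.

```swift
// Hypothetical thresholds and point values, for illustration only.
// Each step up adds fewer points than the one before (diminishing returns).
func contributorPoints(for contributorCount: Int) -> Int {
    switch contributorCount {
    case ..<2: return 0      // sole maintainer
    case 2...5: return 5     // a handful of contributors
    case 6...10: return 8    // more than five
    default: return 10       // more than ten
    }
}

let soloScore = contributorPoints(for: 1)
let teamScore = contributorPoints(for: 7)
```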

How many stars across all of our packages do we have in the package index?

And I'll accept answers to the nearest 10.

Okay.

Nearest 10?

I thought you were going to say nearest thousand.

Well, yeah.

Well, let's say 20.

Okay.

So approximately just under 7,000 packages, probably an average of 10 stars.

So let's say 70,000.

Let's say 65,000.

Oh, you're miles off.

I'll come in again.

You're miles off.

You're two and a half orders of magnitude, I think.

Oh, dear.

Okay.

What was your figure?

65.

No, it's, yeah, it's one and a half.

It's one and a half what?

Now it's become a math question.

Oh, one and a half orders of magnitude.

Yes.

What's the number?

The number is 2,274,276 stars across.

Oh, I got that very, very wrong.

So the average must be much higher.
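
For anyone checking the maths from the quiz, this small sketch works it out. The package count of 7,000 is an assumption based on the "just under 7,000 packages" estimate in the conversation.

```swift
import Foundation

let guess = 65_000.0
let actual = 2_274_276.0
let packageCount = 7_000.0  // assumption: "just under 7,000 packages"

// How far off the guess was, as a power of ten
let ordersOfMagnitude = log10(actual / guess)

// Mean stars per package under the assumed package count
let averageStars = actual / packageCount

print(String(format: "%.2f orders of magnitude", ordersOfMagnitude))
print(String(format: "%.0f stars per package on average", averageStars))
```

The actual figure is roughly 35 times the guess, which is about one and a half orders of magnitude, and the average comes out at roughly 325 stars per package rather than 10.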

Yeah.

So I know there are a couple of packages with a lot of stars, which I suppose I should

have thought about those because I think some of them have like 30,000 stars.

Yes.

Yeah.

I think our highest has 25,000 or so.

Yeah.

And I think there are quite a number of those.

Yeah.

Okay.

So I think you would already get your 65,000 that you have there.

Right.

Okay.

I think there's also a long tail of-

Well, good job, everyone, for clicking on the stars.

Yeah.

I think there's a long tail of just stars on repositories in general.

I think it's quite easy to get a few hundred just because they never go down, right?

No one unstars repositories.

So it's really a continuously growing number on a repository.

That's true.

Yeah.

So yeah, there you go.

That's the quiz question.

Well, there we go.

I shall find an appropriate chiming sound.

Excellent.

Right.

Another thing we could briefly talk about is an interesting post I saw on the Swift

forums in the last week or two.

It's called Calling Haskell from Swift.

I saw this.

Yeah.

And this is a post by Rodrigo Mesquita, and it's a post about calling Haskell

from Swift, as the title says.

Another example of Swift in unusual places.

So last time, I think it was last time we talked about calling Swift from C#.

This time it's Haskell.

It's really interesting to see these pop up.

Yeah.

And it's using C interop in this case.

I think in the C# case, it was C++ interop.

This time it's C. And Codable is used for argument sending.

Really interesting.

I think Swift is quite interesting and perhaps unique in that it has this strong C interop

and effectively allows all of these language integrations to be quite feasible.

The post is a very interesting read, if only to see what

Haskell actually looks like, because it's quite a different beast.

I've never used Haskell.

I'm sort of aware of it, a functional language.

And it's quite alien to read.

You look at a couple of constructs that are quite difficult to decipher.

Without comments and annotation, you probably wouldn't know what's going on.

I took a quick look at Haskell many years ago, and it passed me by.

That language.

Yeah.

I did see a peek and a poke there in the example, which was really nice.

Threw me back to the C64 days where these were things it would do.

The other thing I found interesting, it's using macros on both the Swift and the Haskell

side, Haskell's equivalent of macros, to deal with the bureaucracy of this interop.

Because again, here also, there's a bit of setup that has to be done.

To marshal and unmarshal the arguments that are being passed through, and getting all

the calling set up, which is using C and unsafe buffers and all that stuff.

So there's a bit of bureaucracy that needs to happen to get this going, and macros can

deal with this, which is really nice.
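
The marshalling "bureaucracy" described here could look roughly like the sketch below. Everything in it is an assumption for illustration: the types `Args` and `Sum`, the function `foreignAdd` (a Swift stand-in for a function exported over a C interface, so the example runs without a Haskell toolchain) and the fixed 256-byte reply buffer are all hypothetical, and the post's actual mechanism may differ. The wrapper `callAdd` is the kind of repetitive code a macro could generate.

```swift
import Foundation

// Hypothetical argument and result types.
struct Args: Codable { let x: Int; let y: Int }
struct Sum: Codable { let total: Int }
struct MarshalError: Error {}

// Stand-in for the foreign side of a C-style call: reads `inLen` bytes of
// JSON from `input`, writes a JSON reply into `output`, and returns the
// number of bytes written, or -1 on failure.
func foreignAdd(_ input: UnsafePointer<UInt8>, _ inLen: Int,
                _ output: UnsafeMutablePointer<UInt8>, _ outCapacity: Int) -> Int {
    let data = Data(bytes: input, count: inLen)
    guard let args = try? JSONDecoder().decode(Args.self, from: data),
          let reply = try? JSONEncoder().encode(Sum(total: args.x + args.y)),
          reply.count <= outCapacity else { return -1 }
    reply.copyBytes(to: output, count: reply.count)
    return reply.count
}

// The Swift-side wrapper: encode the arguments, pass raw buffers across,
// then decode whatever came back.
func callAdd(_ args: Args) throws -> Sum {
    let payload = [UInt8](try JSONEncoder().encode(args))
    var buffer = [UInt8](repeating: 0, count: 256)
    let written = payload.withUnsafeBufferPointer { inPtr in
        buffer.withUnsafeMutableBufferPointer { outPtr in
            foreignAdd(inPtr.baseAddress!, inPtr.count,
                       outPtr.baseAddress!, outPtr.count)
        }
    }
    guard written >= 0 else { throw MarshalError() }
    return try JSONDecoder().decode(Sum.self, from: Data(buffer[0..<written]))
}
```

Calling `try callAdd(Args(x: 2, y: 3))` round-trips the arguments through JSON and raw buffers and returns a `Sum` whose `total` is 5.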

And I think that's also something that wouldn't have been possible that easily in an actual

environment.

So this comes with a library that makes it then easier to do this interop.

So that's quite nice, and I wanted to give this a shout out.

I think just worth mentioning that I think the goal pretty much to use this is to use

SwiftUI, the UI layer in Swift, and then call out to a Haskell library.

I think that's, again, as in the C# case, the interest is to use the rich UI libraries

in the Apple platforms and use the backend in other languages.

So quite nice.

Absolutely.

And in not cross language, but cross platform news, I noticed something in the...

So we have this nightly job that looks at every package in the index and the dependencies

that that package has, and it uses those dependencies to discover packages that we have yet to add

to the index.
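
The discovery step in that nightly job boils down to a set difference. This is a minimal sketch under assumed names and a simplified data model (repository URLs as plain strings); it is not the project's actual implementation.

```swift
// Hypothetical data model: each indexed package URL maps to the URLs of
// its dependencies.
func undiscovered(dependenciesByPackage: [String: [String]]) -> Set<String> {
    let indexed = Set(dependenciesByPackage.keys)
    let referenced = Set(dependenciesByPackage.values.joined())
    // Anything referenced as a dependency but not yet indexed is a candidate
    return referenced.subtracting(indexed)
}

let candidates = undiscovered(dependenciesByPackage: [
    "https://example.com/a.git": ["https://example.com/b.git", "https://example.com/c.git"],
    "https://example.com/b.git": ["https://example.com/c.git"],
])
```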

And I noticed, I think on last night's nightly job, that there was a big expansion

of packages from The Browser Company, who

are the people who have been doing a lot of work on Swift on Windows for their Arc browser

that's now in beta on Windows.

And I think before there was only one package that interfaced with the Windows runtime.

And there's not enough information in the readme files for me to be confident about

this, but my guess is that this is a splitting of responsibilities.

Because there are now packages for Swift UWP, which is...

UWP is Universal Windows Programming, maybe?

Universal Windows Platform.

There we go.

Nice.

And there's one for the C Windows runtime, and there's one for the web driver,

which is for doing Appium and WinAppDriver endpoints.

And there's the Windows SDK, which I think was the old WinRT one.

And so I have a feeling there's just a separation rather than anything kind of momentous happening

here.

But if that's not the case, then please fill us in because we'd love to talk about it if

there's more work going on there.

Yeah, that's interesting.

Right.

I've got one more.

And that is an app called Proposal Monitor, and that's by Victor Martins.

Also via the Swift forums, it was announced there.

And it's a nice little app that runs on the iPad and the iPhone, where you can follow

Swift evolution proposals.

It's a bit of a sort of like a Kanban board where you can see what's in review, what's

been accepted, what's been implemented, and so on.

It has different columns and you can click through and see the proposals.

And that's quite nice.

I know there's something on swift.org, but I suspect I might use this.

I was just about to say there is a page on swift.org which also tracks this and has kind

of filters and various things that you can search the list and things like that.

And it is dynamically updated from the website.

So it is always the proposals website.

So it's always up to date on swift.org.

But of course what that doesn't do is it doesn't give you notifications or anything like that,

or arrange them in a Kanban board.

Right.

Yeah.

Is that data available in an API or in a JSON feed or something?

Is that where Victor is getting this?

Yes, there is a JSON file that it works off.

Yes.

I should know where that JSON file is.

It's in one of the repositories.

It's in a repository somewhere.

Right.

Excellent.

Well, that solves it then.

Start at the beginning of GitHub and I'm sure you'll find it eventually.

It's somewhere.

It currently doesn't have a macOS app, but apparently that's soon to follow, as Victor has announced in the thread.

And there you go.

Proposal Monitor.

Give it a look.

I wonder if this app is using that JSON file because it's not hidden.

It is an open JSON file.

So it could be.

I would imagine so.

That's why I was asking because I don't, I mean, I wouldn't imagine he's scraping the

website or anything.

I mean, nothing wrong if he did it, but I think it's going to be much easier to ingest

a JSON file.
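
Ingesting a JSON file and arranging proposals into Kanban-style columns could be sketched like this. The schema (`id`, `title`, `status`) and the sample entries are invented for illustration; the real proposals feed may use different field names and structure, and the app may work differently.

```swift
import Foundation

// Hypothetical schema, not the real feed format.
struct Proposal: Decodable {
    let id: String
    let title: String
    let status: String
}

// Decode the feed and group proposals by status, one group per board column.
func kanbanColumns(from data: Data) throws -> [String: [Proposal]] {
    let proposals = try JSONDecoder().decode([Proposal].self, from: data)
    return Dictionary(grouping: proposals, by: \.status)
}

// Made-up sample data
let sampleJSON = """
[
  {"id": "EX-0001", "title": "First example", "status": "inReview"},
  {"id": "EX-0002", "title": "Second example", "status": "accepted"},
  {"id": "EX-0003", "title": "Third example", "status": "accepted"}
]
"""
```

Passing `Data(sampleJSON.utf8)` to `kanbanColumns(from:)` gives one column per status, with two proposals under "accepted" and one under "inReview".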

So, yeah, I think it's going to be a good idea.

I can kick us off this week.

My first package is called WhisperKit.

The company, I think it's made by a company, Argmax, but the primary contributor is Zach

Nagengast.

WhisperKit is a package that will take the OpenAI Whisper speech recognition model, which

is freely available.

Even the models, not only the...

The code, but the model as well is freely available.

And give it to you as a package that you can use to interpret speech or to do speech to

text.

And there's a couple of reasons to mention this.

First of all, this process is incredible.

We've actually been using Whisper for probably about a year now to create transcripts of

this podcast.

So, if you...

If you enjoy the transcripts or if you were not aware of the transcripts, they are...

Every episode has been transcribed by Whisper and the accuracy is remarkable, really.

It's incredibly high quality.

It doesn't...

It's not perfect, but it's not far off.

Especially if you download the large model, which is one of the models that they make

available.

The accuracy is really very, very good.

We use an application called Whisper.

Is it called Whisper or is it called MacWhisper?

That's right.

MacWhisper.

Yeah.

Yep.

And that's by Jordi Bruin.

And MacWhisper is a Mac app that also uses the same model and the same code to do this,

but it's in the form of a Mac app.

And this package by Zach is in the form of a Swift package.

And...

Yeah.

It's worth checking out if you have any kind of speech to text requirement, I would thoroughly

recommend that you check out Whisper.

And this package might be a good way to interact with it.

I think there is also a command line tool for it.

So you could use it both as an SDK, but you can also use it as a command line tool if

you don't have MacWhisper or something like that.

There are several tools that use this that are out there.

The other thing just to mention here is that all of the processing for the Whisper OpenAI

model happens on device.

So you're not having to upload your audio into the cloud and wait for a network connection

to bring you back the text.

This is all happening once you've got the model downloaded.

And the model is, I think, quite big.

I think it's a couple of gigabytes.

Yeah.

Maybe even a little bit more than that.

But once you have that model inside your app, you can transcribe as much audio as you'd

like to.

Right.

And I'm not sure if you know, but I guess, like, how would the package deal with the

model?

Is that something you need to invoke and download separately?

I mean, it's presumably not shipped with a package, right?

It's not shipped with a package, no.

No, it's not shipped with the package.

It's actually...

I did read about this in the readme file.

I think it's a little bit more complicated.

Yeah.

But if you call WhisperKit and give it some audio to transcribe and give it a model that

it doesn't have, that call to transcribe will download the model.

Now, I think...

Oh, okay.

Probably.

That's probably not quite the right approach to it.

But certainly, there are methods in there to download the model so that you can make

sure you're ready before actually starting the transcription process.

Nice.

Yeah.

WhisperKit.

WhisperKit is great.

Yeah.

I've seen the results.

It's quite amazing, quite remarkable how well that works.

So yeah, it's the real world face of AI, this package.

Right.

My first pick is called Async Channels by Brian Floersch.

And this is an interesting package that's bringing the equivalent of Go's channels to

Swift, as the name sort of implies.

And as you can imagine, that sort of spawned a discussion about performance in a thread

on the Swift forums.

And that was also interesting.

Don't tell me you found a half second delay and realised that there's been a...

No, not that kind of a half second.

No, but if there's a different language, different implementation, sort of the first thing people

do is check out how does it hold up.

Yeah.

It's a nice collaborative effort there to see, compare the results and make improvements

because they are needed.

It's quite a bit slower than the Go original.

I'm not sure how much that is due to Go channels just being very well tailored to how Go's

Async model works and that might be part of it.

What's also...

So I should mention that.

Yeah.

I think the benchmarks are up to 10X in Async channel versus Go's channel.

So it's quite significant.

Quite significant.

Yeah.

I am not 100% sure how reliable the benchmarks themselves are.

I think they cover a few things, but I got the sense that there might be more work needed

on how to run the benchmarks.

And I think they're currently taking, looking at averages, which is... we've discussed this

in the past, isn't necessarily the best way or isn't great in finding actual real-time

performance.
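
As a small illustration of why averages can mislead in benchmarks, the sketch below compares the mean with nearest-rank percentiles on made-up timings. The sample data and helper names are hypothetical and are not taken from the benchmarks in the forum thread.

```swift
// Arithmetic mean of the samples
func mean(_ samples: [Double]) -> Double {
    samples.reduce(0, +) / Double(samples.count)
}

// Nearest-rank percentile, with p given as a whole number from 1 to 100
func percentile(_ samples: [Double], _ p: Int) -> Double {
    let sorted = samples.sorted()
    let rank = (sorted.count * p + 99) / 100  // ceiling of count * p / 100
    return sorted[max(rank, 1) - 1]
}

// Made-up timings in milliseconds: nine fast runs and one slow outlier
let timings: [Double] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 11]

let average = mean(timings)
let median = percentile(timings, 50)
let worst = percentile(timings, 100)
```

Here a single outlier doubles the mean to 2 ms while the median stays at 1 ms, so reporting only the average hides both the typical case and the worst case.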

I'll dig up the past discussion we had and this came up when we looked at the package

benchmark.

Right.

But what's also interesting, there is another implementation of this concept and

that's in the swift-async-algorithms package by Apple itself.

So this is there.

There's a type called AsyncChannel, which does effectively the same thing pretty much

with a different API.

But this has also been benchmarked and that's slower than this Swift package.

So there is certainly room for improvement in that package, in Apple's package.

But I guess also in this new implementation, it's quite interesting.

And I tried this out a bit.

I think the API is quite nice.

So if Async streams are...

a bit nebulous to you, which they can be.

I think this API is interesting.

It's a bit clearer what you're sending and how...

It's using operators, so it has a couple of tricky bits around that.

But if you're sort of new to the concept and want to understand how that works, it's quite

interesting to set this up in a playground and play around with it a bit.
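
For a playground experiment with the general idea, here is a minimal sketch using AsyncStream from the standard library. This is deliberately not the Async-Channels API (which uses operators); it only shows the send, receive and close shape of a channel. The function name `sumOverStream` is made up, and `AsyncStream.makeStream(of:)` requires Swift 5.9 or later.

```swift
// Standard-library analogue only: AsyncStream, not the Async-Channels package.
func sumOverStream(_ values: [Int]) async -> Int {
    let (stream, continuation) = AsyncStream.makeStream(of: Int.self)

    // Producer: sends each value, then finishes the stream, which plays the
    // role of closing a channel
    let producer = Task {
        for value in values { continuation.yield(value) }
        continuation.finish()
    }

    // Consumer: receives values until the stream is finished
    var total = 0
    for await value in stream { total += value }
    await producer.value
    return total
}

let streamTotal = await sumOverStream([1, 2, 3])
```

One difference to keep in mind is that AsyncStream buffers values by default and never suspends the sender, whereas sending on an unbuffered Go channel blocks until a receiver is ready.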

So that's the package called Async channels by Brian Floersch.

Bye.

There we go.

And next week we'll be talking about how we can integrate Go code with Swift so that

we...

I'm sure that exists.

We'll dig that up.

I'm sure it does.

It'll be, again, be the C API, I would imagine, that will lead to that.

So my next package for today is, well, it's one of those packages, you're going to laugh

at me here, that there are some packages that...

Like, for example, the one you just talked about and WhisperKit and things like that, that

are big problems that solve fundamental issues that you might have in your application or

that allow you to do things that you would not otherwise be able to do in your application.

And some packages solve a very small and very targeted problem.

And my next package is one of those.

But then I love them just as much because that problem is just something that you're

going to have to solve over and over again.

And I'm going to talk about that in a bit.

But I'm going to talk about the other one.

We will see you in a few weeks.

See you then. Bye bye.