Making artificial intelligence practical, productive & accessible to everyone. Practical AI is a show in which technology professionals, business people, students, enthusiasts, and expert guests engage in lively discussions about Artificial Intelligence and related topics (Machine Learning, Deep Learning, Neural Networks, GANs, MLOps, AIOps, LLMs & more).
The focus is on productive implementations and real-world scenarios that are accessible to everyone. If you want to keep up with the latest advances in AI, while keeping one foot in the real world, then this is the show for you!
Welcome to the Practical AI podcast, where we break down the real world applications of artificial intelligence and how it's shaping the way we live, work, and create. Our goal is to help make AI technology practical, productive, and accessible to everyone. Whether you're a developer, business leader, or just curious about the tech behind the buzz, you're in the right place. Be sure to connect with us on LinkedIn, X, or Blue Sky to stay up to date with episode drops, behind the scenes content, and AI insights. You can learn more at practicalai.fm.
Jerod:Now, onto the show.
Daniel Whitenack:Welcome to the Practical AI podcast. This is Daniel Whitenack. I am CEO at Prediction Guard, and I'm joined as always by my cohost, Chris Benson, who is a principal AI research engineer at Lockheed Martin. How are you doing, Chris?
Chris Benson:I am doing just fine. It's been a good day, good fall, and lots of cool things happening to talk about.
Daniel Whitenack:Yeah. Yeah. I'm excited for today's discussion. Although I have to say I feel a bit outnumbered by the Chrises, but there's some cool Chrises on the show today, including yourself, Chris Benson. But we've also got with us Chris Aquino, who is a software engineer at Thunderbird.
Daniel Whitenack:Welcome. We won't call you Chris b. Actually, your last name starts with an a, so maybe you're Chris a and Chris Benson is Chris b, and that just works out because his name starts with b.
Chris Aquino:That's perfect.
Daniel Whitenack:Yeah.
Chris Aquino:Hello. Hello. Thank you for having me. I know that we had some rescheduling issues early on, but we're here now.
Daniel Whitenack:We're here now and glad we are, yeah, because we've had a few guests from Mozilla or projects that Mozilla has been involved with for some time or a couple times in the past and it's always great discussions and of course love the perspective that Mozilla brings but also projects like Thunderbird. Could you give us maybe just starting out a little bit about your personal background and kind of how that eventually led you into work on Thunderbird?
Chris Aquino:Yeah, so my personal background, I've been a web developer for, oh my goodness, two decades. Let's go with two decades ago is when I started, and I worked for various companies that did different things. In addition to being a web developer, I've done teaching and authoring. Most recently, prior to Thunderbird, I was at SurveyMonkey for a little while, and then the great layoffs of 2022 and 2023 hit. I was one of the newer engineers, so I got cut.
Chris Aquino:And as I was applying for jobs, I was like, you know, I've always wanted to work for Mozilla. Let's just see what their job board looks like. And that's when I submitted an application and got word back from somebody who was clearly not a recruiter. The director of product emailed me, you know, just emailed me and said, Hey, I'd love to talk with you. And yeah, I initially got hired on to build just weird stuff that was outside of the realm of the Thunderbird desktop.
Chris Aquino:His name is Ryan, Ryan Sipes. He's always got something cooking. He had some interesting ideas for a set of products that he wanted to explore. And so he's like, would you be interested in joining us and working on some of this weird stuff? And so I said yes.
Chris Benson:As a quick follow-up on that, just for the guests that may not be familiar with Mozilla and the context of Thunderbird within Mozilla, could you talk a little bit about that for a moment just to set the stage?
Chris Aquino:Yes, I will do my best. It is a little convoluted, as Wikipedia can tell you. So Thunderbird, all right. Well, first of all, for those who don't know, for the kids out there who are like, What is email? Thunderbird is a desktop email client. And for a certain generation, those three words strung together mean nothing.
Chris Aquino:But it is a twenty plus year old open source project which originated at the company now known as Mozilla. If you're familiar with the Firefox browser, that is made by our sister company, the Mozilla Corporation. And Thunderbird and the Mozilla Corporation all fall within, I guess you could say, the guidance of the Mozilla Foundation, which is a nonprofit. And as of, oh, I guess it was five ish years ago, maybe a little bit longer than that, nobody at Mozilla was maintaining Thunderbird. So this application that was not Firefox, you know, they didn't want to deal with the maintenance it required, you know, the engineers.
Chris Aquino:Engineers need to maintain this almost twenty year old C plus plus code base. And that was for, you know, management reasons. They're like, okay, we could just hand this to the community or we could just shut it down. And Ryan, who's now the director of product here, said, Wait a minute. Listen, give me a shot.
Chris Aquino:Let me try something out. Let me see if I can turn this into something more than it is now. And they did whatever legal things and paperwork were necessary to spin it off into its own entity. We are known as Thunderbird or the Thunderbird Project, but if you look us up on the internet, we're officially MZLA, which sometimes I feel like is one of those acronyms that doesn't really mean anything. But that's what it says on my resume.
Chris Aquino:So that's Thunderbird in a nutshell.
Daniel Whitenack:Now, yeah, you mentioned this kind of history, this sort of desktop email application history. Just for context, I'm sure there's many people listening that do remember those sorts of days, but others that might have just always used Gmail in a browser or something like that. But there's a whole generation of us that used something, whether it's Outlook, which a lot of people have used. I remember installing Ubuntu for the first time earlier in my career and using Thunderbird in that context on my computer. Now you mentioned there's kind of maybe a different, a broader vision for this now.
Daniel Whitenack:Any context you could provide there in terms of the way people use email now versus those days when Thunderbird was being used as a desktop application. What is kind of the, I guess, the focus and transition, if you will, or at least some of the things that are being thought about?
Chris Aquino:Yeah. I'll start with sort of the ethos, like why is Thunderbird still around? There are still people who use email. It works perfectly well. But unlike using, say, a webmail based provider, these folks are not interested in having ads sort of injected for them, and maybe they want to be able to opt in to AI.
Chris Aquino:They don't want an AI just ever present and reading their emails. So with Thunderbird, it's free and open source. No ads ever, forever. And you know, your email is your own. It gets downloaded to your computer.
Chris Aquino:So if you lost internet access, you can still reach your email. If you lost power and your laptop is open to Thunderbird, you've still got it, right? All those emails that you downloaded, they're right there. So I think that the way that email is being used, it's different. It's different now.
Chris Aquino:One of the great things about Thunderbird back in the early days, and it's still true today, is it's very easy to manage multiple email accounts. That was, you know, I remember a time when like, oh, you have more than one email address? Amazing. You must be really important. And now you can just sign up for email addresses, dozens of them if you want, and, you know, have them each dedicated to a different purpose.
Chris Aquino:You can still do that with Thunderbird. However, one of the things that people are encountering now, as we all are, is a certain amount of information overload, right? You're subscribed to so many newsletters and mailing lists, and you're collaborating on some side project, and then you've got your main work email. How do you read all that email?
Chris Aquino:How do you deal with all of it? So those are the sorts of things that we're thinking about now is how do we empower the user to better deal with just this huge influx of information that they're getting every day, every hour?
Daniel Whitenack:So first off, thank you for validating my multiple email use. Every once in a while, my wife gives me a hard time because I generate a new email address every once in a while when I get frustrated with my email feed or have a specific purpose, and it seems like no one ever knows which email to email me at, which maybe is a strategy in and of itself. I'm not sure. So thanks for validating that. But yeah, I know that there's a good amount of AI intersection with email, whether that be from the web email side, so in Gmail or other things with Gemini, or email clients that maybe are specifically geared towards AI features like a Superhuman or these sorts of things.
Daniel Whitenack:Not asking you to comment on every single one of those, but maybe in general, how from your perspective have you viewed this gradual integration with AI? And what sort of categories could people have in their mind of the kinds of ways that AI is being applied within email and some of those trade offs that maybe you're making when you're using those features? Of course I know we'll talk about privacy in some of what we talk about with Thunderbird. But maybe just from your perspective, I would be curious how you categorize it, because AI in email could mean a lot of things. Could you help people understand maybe some of that landscape?
Chris Aquino:Sure. Yeah. I think there are two main ways that I see it, I mean, in my own Gmail account, because I will use a device that doesn't have Thunderbird on it. Because I am a longtime distro hopper, maybe I'm using a distro that doesn't have Thunderbird prepackaged, and I'm just trying things out. I have found that the automatic summarization is a thing that, you know, Gmail's like, Hey, Gemini can do this for you. Or all of the autocomplete that it tries to helpfully offer me, I feel like it's a little creepy sometimes, especially depending on what the email is.
Chris Aquino:Like if I'm talking to my doctor over email and it's like, clearly this LLM has read this very private information, I'm like, Oh, how do I turn this off? But I do understand that for a lot of people, those two features are time savers. You have this way of compressing more human time into your day by offloading it to an LLM. So that's great. It's a time saver and it's great that people have access to that.
Chris Aquino:There are some things that you lose out on, though. You're gaining time, but what you're trading away are things like tone, right? Some summarization models will kind of strip out all the tone. Or if it's bulk summarizing multiple emails, that email from your mom doesn't really sound like your mom. It is literally just, your mother's coming to town this weekend. So maybe dehumanization of email is kind of the wrong term or maybe a little extreme, but everything gets essentially normalized to the tone of the LLM.
Chris Aquino:And yeah, to your point, the privacy aspect, that's kind of the big one for us here at Thunderbird. We're very privacy respecting, privacy preserving, because a lot of our users choose to use Thunderbird because of that. They want to manage and own their own email. They don't want their personal emails harvested for marketing purposes or for training data.
Sponsors:Well, friends, when you're building and shipping AI products at scale, there's one constant, complexity. Yes. You're bringing the models, data pipelines, deployment infrastructure, and then someone says, let's turn this into a business. Cue the chaos. That's where Shopify steps in, whether you're spinning up a storefront for your AI powered app or launching a brand around the tools you built.
Sponsors:Shopify is the commerce platform trusted by millions of businesses and 10% of all US ecommerce from names like Mattel, Gymshark to founders just like you. With literally hundreds of ready to use templates, powerful built in marketing tools, and AI that writes product descriptions for you, headlines, even polishes your product photography. Shopify doesn't just get you selling, it makes you look good doing it. And we love it. We use it here at Changelog.
Sponsors:Check us out at merch.changelog.com. That's our storefront, and it handles the heavy lifting too. Payments, inventory, returns,
Jerod:shipping, even global logistics. It's like having an ops team built into your stack to help you sell. So if you're ready to sell, you are ready for Shopify. Sign up now for your $1 per month trial and start selling
Sponsors:today at shopify.com/practicalai. Again, that is shopify.com/practicalai.
Daniel Whitenack:Well, Chris, we were starting to get into a little bit of, I guess, the intersection of the ethos of Thunderbird with these sorts of AI features. Now, could you help us understand, I guess, like we talked about autocomplete, we talked about summarization, for example. There are various mechanisms by which these features can be implemented in an email client or an application or web application, whatever that is, in terms of the actual AI model, where it sits, how the data flows, what the model is trained on maybe. Could you help us understand that piece? What's the buffet of options available to us in terms of how we might integrate, like the integration point in the flow of data?
Chris Aquino:Sure. Yeah, we have discussed at length different ways that we could approach this. So let me begin by saying that this is experimental work in bringing an AI assistant to Thunderbird. This is not baked into Thunderbird. We're not going to turn it on for users.
Chris Aquino:It's not going to be automatic or anything like that. Instead, what we've done is we have built it as, I'm going to call it sort of a companion for right now. We'll put a pin in that. We'll return to that because there's a lot of decisions we had to make because of that approach. Now, the options that were available to us: we could just add model inference to Thunderbird itself.
Chris Aquino:This is, in my opinion, like, yeah, but now this is us turning it on for all users. It's like, just add the model in there and allow us to do inference locally. It's private, right? And that's great. However, we're not so invested in this idea that we want to put that on the desktop team's roadmap.
Chris Aquino:When I say the desktop team, just a little plug for the mobile team at Thunderbird. You can grab it for Android now, works really well. iOS is coming soon. But as far as the desktop client goes, they already have their road map and we just kinda wanna run our AI experiment parallel to that. Okay, so that's one option.
Chris Aquino:A second option is like, well, could we run inference in a separate application, also locally? Yes, we can. And we've kind of poked around with that, but if we want to roll this out, we don't necessarily want to require users to download a second application. Okay. So well, how do we split the difference here?
Chris Aquino:Alright. So what if we called out to an API, like one of the cloud providers, for any of the models that would be good at helping you out with email, the typical tasks of summarization or reply generation? So we decided that could be a thing because it doesn't require installing anything, you know, heavyweight and additional, but that brings up a separate problem of, where are you sending the email data to? And I will be talking more at length about that.
Daniel Whitenack:Yeah. I think in all my discussions in my day job work, that's often what it comes down to is we would love your AI features, where is the data going? So I definitely understand that. And I assume that there's definitely this tension of doing things locally and putting that on, like you said, the desktop roadmap. But also there would potentially be kind of either limitations in terms of the kind of model you could use, or even just if you could use a model that worked well, but it might just destroy all of the battery of the device on which it's running or that sort of thing.
Daniel Whitenack:Were those things also part of that conversation?
Chris Aquino:They definitely were, especially regarding the second application that just ran a model inference process in the background. I don't know about you, but my laptop, my work laptop, is not super fancy. It generates, you know, a few tokens per second, which is just not fast enough for this, I mean, for me to work on the thing that I'm working on. So that was a big concern.
Chris Aquino:That and, you know, a lot of our users are on Linux, and they're running Linux because maybe they're continuing to use their perfectly good hardware from ten years ago, which most certainly cannot do any sort of local model anything. But yeah, laptop users, I love my battery life. I don't want to completely destroy it by trying to summarize a batch of today's emails.
Daniel Whitenack:Yeah, yeah, that makes sense. So you mentioned this idea of using a model that is behind a remote API. There's obviously a selection. We're kind of narrowing in. There's a sort of selection of ways that that could happen.
Daniel Whitenack:And there's like various approaches around maintaining privacy there from just using an API that explicitly doesn't store certain data, at least according to their terms, or doesn't train on your data, at least according to their terms. There's also kind of, I know people are exploring kind of homomorphic encryption and all of these things to keep data private. Then there's sort of end to end encryption and there's all sorts of ways that you could think about privacy in that context. What were the main, I guess, the main pillars of what were important for you all to consider? Was that where data is stored at rest?
Daniel Whitenack:Is it the openness of models or whether you were hosting those or a third party was hosting those? Yeah, what were kind of the main topics that came up once you kind of dipped into that remote inference side of things?
Chris Aquino:That is a great question. And I'm going to try to condense the story down.
Daniel Whitenack:Okay, great.
Chris Aquino:It'll be kind of like, you know, The Hobbit and then the three parts of the rest of the story.
Daniel Whitenack:Sure.
Chris Aquino:Wow, you should just totally, like, revoke my nerd card now for not coming up with it off the top of my head.
Daniel Whitenack:The Fellowship of the Ring? That's the one. Exactly.
Chris Aquino:I just lost half of your listeners.
Daniel Whitenack:The Return of the King. Yeah, you got it. I at least, I might not know about the cool encryption stuff that you're about to talk about or whatever that is, but at least I have that one.
Chris Aquino:Nice. Well, our powers combined.
Daniel Whitenack:The powers combined, it takes a community. Yes, yes.
Chris Aquino:So let me start this little story off: the very earliest experiment with this was as a Thunderbird add on. Okay. What is a Thunderbird add on? Well, you and your listeners are probably familiar with browser extensions. Okay.
Chris Aquino:So fun fact, Thunderbird under the hood uses the Gecko rendering engine from Firefox. So we have access to APIs that make it possible for something almost exactly like a browser extension to reside within Thunderbird. And if you're not familiar with browser extension development, it's basically HTML, CSS, and JavaScript. So we started by writing something that was the most 1990s looking web page, just sort of jammed into an add on and displayed in a new tab in Thunderbird. And I mean, you know, it looked like an engineer built it, and that's totally fine.
Chris Aquino:But yeah, that's when we started calling, we started off with OpenAI's API and just handing off a number of emails to, I think it was GPT-4. And that did an okay job. But that was like, okay, it does work. How do we build this out? And then we started trying to get better results with some prompt tuning and whatnot.
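To make that early experiment concrete, here is a rough TypeScript sketch of what that first add-on flow could have looked like: pull a message via Thunderbird's MailExtension messages API and hand its body to OpenAI's chat completions endpoint. The messenger calls are close to the documented MailExtension APIs, but the exact usage, prompt wording, and helper shape are assumptions, not the actual Thunderbird Assist code.

```typescript
// Illustrative sketch of the early add-on experiment (not the shipped code).
// Assumes a MailExtension context where the global `messenger` object exists
// and the add-on has permission to read messages.
declare const messenger: any;

async function summarizeDisplayedMessage(tabId: number, apiKey: string): Promise<string> {
  // Get the message currently shown in the given tab, then its full MIME content.
  const header = await messenger.messageDisplay.getDisplayedMessage(tabId);
  const full = await messenger.messages.getFull(header.id);

  // Naively grab the first text part; real code would walk the MIME tree.
  const body: string = full.parts?.[0]?.body ?? "";

  // Hand the email off to a hosted model -- this was the first, non-private version.
  const res = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({
      model: "gpt-4",
      messages: [
        { role: "system", content: "Summarize the user's email in three sentences." },
        { role: "user", content: body },
      ],
    }),
  });

  const data = await res.json();
  return data.choices[0].message.content;
}
```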
Chris Aquino:And then as we started trying to use it with more people within Thunderbird, we found out, these people, their emails are sensitive. Like, we need to do something about this. So we started shopping around for some sort of cloud based provider that could give us a guarantee of, yes, we do not store your data. No, we're not using it for training. And we were in contact with a couple of different companies, some of whom just sort of sent us to one of the pages on their website which told us nothing useful, couldn't give us a good guarantee.
Chris Aquino:So that's when we started talking to the folks at Flower Labs. And I know that you have had several guests from Flower on the show. I just want to say that they are so terrific to work with. They took care of I mean, really? They took care of all of our needs.
Chris Aquino:They moved things around on their own development roadmap and gave us early access to things like end to end encryption and access to their newest product, Flower Intelligence. So for the listeners who haven't heard of Flower or listened to previous episodes, they're known for their federated learning SDK, right? They build software that sort of does learning on individual nodes and then shares the learnings with a centralized server. All very cool stuff. We didn't need that, though. We needed a private API, or rather, API access to a private LLM.
Chris Aquino:What we got in addition to that is a nice SDK in TypeScript, and we also got the end to end encryption. And oh, right, they found a model for us, and then they did some post training on email summarization. They built an eval system so they could, like, fine tune. They helped produce prompts through the eval system. I mean, they've been incredible.
Chris Benson:I'm just curious as you talk about this, and especially having had multiple folks from Flower on the show in the past. You're talking a little bit about kind of how you got into the collaboration, but how did Flower come into the picture to begin with for you guys? You know, how was that connection made, and as you were looking at that connection potential, how did you know that that was a good fit for this new strategy that you've been laying out?
Chris Aquino:I have the most boring yet magical answer to that question. It just kind of fell into our laps because Mozilla, alright, remember I was talking about the nonprofit Mozilla Foundation under which Thunderbird sits? They're an investor in Flower. And so Mark Surman connected Ryan, my boss's boss's boss, with Daniel from Flower.
Chris Aquino:And I just ended up on a Zoom with him one day. Ryan introduced us and said, All right, take it away. And that's when the collaboration began. And so that's the thing. I guess having a human face to go with the company made me feel good about those guarantees of like, no.
Chris Aquino:We're literally not in the business of harvesting data. Like, we're gonna set up this infrastructure, and we literally can't help you debug your prompt because it's encrypted now. So they were so helpful at every step except when I sent some bad data. Like, the fact that they were like, we really wanna help you, but we built it in a way that we can't see it, the data. So it was a fortuitous connection, thanks to the reach of the Mozilla Foundation.
Daniel Whitenack:Well, Chris, you started kind of unveiling some of what made this particular route of experimentation useful for you all. Could you help us understand maybe just at a slightly deeper level, like I could, let's say, spin up a model in vLLM in a VM on GCP or wherever I host things and then connect to that over an API. What kind of makes the hosting of the private model within the Flower system different? Because it's, like you say, it's not federated, but there is still kind of more there. You mentioned, of course, there's the post training that you talked about.
Daniel Whitenack:But as far as the inference side, could you help us understand that a little bit more?
Chris Aquino:Yes. So as I mentioned, they've taken care of all of that, which is the big benefit. They've really been a great technical partner while we conducted these experiments. And from the privacy aspect, my coworkers no longer have to prune their inbox and remove anything sensitive before trying out what we, at this point, have dubbed Thunderbird Assist. It is your personal executive assistant within Thunderbird.
Chris Aquino:That's the idea, anyway. Thanks to the guarantees made by Flower, we were then able to try out, I mean, we didn't land on the current model immediately. We tried different BERTs, different BARTs, RoBERTa, all the various summarization models. And then one day, the summaries got way better. And I said, What did you do?
Chris Aquino:What is this magic? They switched to one of the Llama models from Meta that was trained for conversation. And it just worked better for email content. So at that point, we stopped thinking about the prompt because they had squared that away. And prior to that, they found a model that could do the task very well.
Chris Aquino:So then that got to the part where we're like, okay, so it's really great at summarization and reply generation. The third feature that we worked on, you know, this is the biggest thing we aimed for. Because when you think about it, like, oh, hey, you're doing email summarization and reply generation. That's great. That's basically the hello world of, you know, LLMs, summarize this text.
Chris Aquino:So we had been working on this feature that did not work well. We refer to it as the daily brief. And it was intended to be an executive summary of your recent emails. This is when I learned the definition of overprompting. What I would do is I would take however many emails arrived in the last twenty four to forty eight hours, ship them off to the model, and then ask it, oh, can you find the most important messages, extract all of the highlights and any action items, and return those back to me grouped in this particular way with links back to the original emails.
Chris Aquino:I mean, the garbage that I got back sometimes was epic. So I then learned that, okay, what I need to do is split this up into multiple requests. Right? Let's send a batch and only ask for importance. Some of your listeners are like, okay.
Chris Aquino:That's highly subjective, and I will return to that momentarily. But when we provided the emails, we also provided the unique message IDs from Thunderbird. So that way I could then use that as an index, grab the original emails again and send off a second request, which was like, okay, so for these, I want you to find the action items and then take the same batch. Now ask for highlights and crucial information. The formatting task never worked well, so I have lots of feelings about that, so we'll put a pin in that.
Chris Aquino:The formatting never turned out well because what I wanted as an application developer was something the LLM was not capable of, right? It's a statistical model, and I guess it's thinking that maybe they want the header bolded sometimes and maybe they don't. I'm like, LLM, do what I want. And it's like, who knows? I'll roll the dice.
Chris Aquino:Maybe I'll give it to you the way that you want. And so that was an important lesson: currently, you really need to be very careful, specific, and constrained in what you ask the model for. And then the problem turned into, okay, so how long does it take to make these subsequent requests? What can we parallelize? How do we do that effectively without burning up all of Flower's compute in their infrastructure?
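As a sketch of what that splitting could look like in practice, here is an illustrative TypeScript version of the daily brief flow: one focused request per concern, keyed by Thunderbird's message IDs, with the independent follow-ups run in parallel. The askModel helper and the prompt strings are hypothetical placeholders, not Flower's SDK or the actual add-on code.

```typescript
// Illustrative sketch of splitting the daily brief into focused requests.
// `askModel` is a hypothetical helper that sends a prompt plus email texts
// to the remote model and returns its text response.
declare function askModel(prompt: string, emails: { id: number; text: string }[]): Promise<string>;

interface Email { id: number; text: string; }

async function buildDailyBrief(recent: Email[]): Promise<{ actionItems: string; highlights: string }> {
  // Request 1: ask only which messages matter, returning Thunderbird message IDs.
  const importantIds = JSON.parse(
    await askModel("Return a JSON array of the IDs of the most important messages.", recent)
  ) as number[];

  // Use the IDs as an index back into the original emails.
  const important = recent.filter((e) => importantIds.includes(e.id));

  // Requests 2 and 3 are independent, so they can run in parallel --
  // the trade-off is more total calls against the provider's compute.
  const [actionItems, highlights] = await Promise.all([
    askModel("List any action items in these messages.", important),
    askModel("Extract the key highlights and crucial information.", important),
  ]);

  return { actionItems, highlights };
}
```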
Chris Aquino:So we started looking for ways to optimize that and we switched to a local Bayesian classifier. So we'll take the first of several tasks and we'll just do that locally. So instead of asking an LLM to very subjectively decide what sounds important or what the cosine similarity algorithm tells them is important, we'll do that locally. We'll let the user use the Thunderbird feature of tagging emails as priority one, priority two, whatever. So for our experimentation, we had each user tag a handful of messages as highest importance and tag a handful as least important.
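To make the classifier idea concrete, here is a self-contained sketch of a tiny Naive Bayes importance filter trained on those priority tags. It is illustrative TypeScript only; the add-on bundles an existing JavaScript library rather than code like this.

```typescript
// Minimal Naive Bayes importance filter trained on user-tagged messages.
// Illustrative only -- not the library Thunderbird Assist actually uses.

type Label = "important" | "not-important";

class NaiveBayes {
  private counts: Record<Label, Map<string, number>> = {
    "important": new Map(),
    "not-important": new Map(),
  };
  private docs: Record<Label, number> = { "important": 0, "not-important": 0 };
  private vocab = new Set<string>();

  private tokenize(text: string): string[] {
    return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
  }

  train(text: string, label: Label): void {
    this.docs[label]++;
    for (const tok of this.tokenize(text)) {
      this.vocab.add(tok);
      this.counts[label].set(tok, (this.counts[label].get(tok) ?? 0) + 1);
    }
  }

  // Returns the more likely label using log probabilities with add-one smoothing.
  classify(text: string): Label {
    const labels: Label[] = ["important", "not-important"];
    const totalDocs = this.docs["important"] + this.docs["not-important"];
    let best: Label = "not-important";
    let bestScore = -Infinity;
    for (const label of labels) {
      const totalTokens = [...this.counts[label].values()].reduce((a, b) => a + b, 0);
      let score = Math.log((this.docs[label] + 1) / (totalDocs + labels.length));
      for (const tok of this.tokenize(text)) {
        const count = this.counts[label].get(tok) ?? 0;
        score += Math.log((count + 1) / (totalTokens + this.vocab.size + 1));
      }
      if (score > bestScore) { bestScore = score; best = label; }
    }
    return best;
  }
}

// Usage: train on the handful of messages each user tagged, then filter locally.
const filter = new NaiveBayes();
filter.train("quarterly report due Friday, please review", "important");
filter.train("weekly newsletter: ten gadgets you don't need", "not-important");
console.log(filter.classify("please review the attached report before Friday")); // "important"
```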
Chris Aquino:And then the local Bayesian classifier that we just included as a JavaScript library in the Thunderbird add on works very quickly, even for lots and lots and lots of messages. And so, okay, task number one, taken off the plate of the LLM. And so now we just have it do the rest of the tasks. And likewise, the formatting task, we just handle that ourselves. A quick note about formatting.
Chris Aquino:For a time when we were using one of the other third party cloud providers, we found that you could provide them a JSON schema that the model would conform to when giving you the response. That was a magical time for me as an application developer because it's like, oh yeah, give me the JSON. I will just put it through my framework and it's just going to render the things beautifully. Like, look at my CSS applied so perfectly to this. This is amazing.
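For reference, several hosted providers support this kind of schema-constrained output. The sketch below uses OpenAI-style structured outputs as a stand-in; whether this matches the exact provider and fields they were using at the time is an assumption.

```typescript
// Sketch of asking for schema-constrained JSON (OpenAI-style structured outputs shown as one example).
async function requestDailyBrief(apiKey: string): Promise<{ highlights: string[]; actionItems: string[] }> {
  const res = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json", Authorization: `Bearer ${apiKey}` },
    body: JSON.stringify({
      model: "gpt-4o",
      messages: [{ role: "user", content: "Summarize today's emails as a daily brief." }],
      response_format: {
        type: "json_schema",
        json_schema: {
          name: "daily_brief",
          strict: true,
          schema: {
            type: "object",
            properties: {
              highlights: { type: "array", items: { type: "string" } },
              actionItems: { type: "array", items: { type: "string" } },
            },
            required: ["highlights", "actionItems"],
            additionalProperties: false,
          },
        },
      },
    }),
  });

  // The response content now parses into the exact shape the UI expects to render.
  const data = await res.json();
  return JSON.parse(data.choices[0].message.content);
}
```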
Chris Aquino:As we were model hopping and provider hopping, that kind of went away. And we haven't returned back to it because at some point we realized that, okay, so learning our lesson from before about splitting up tasks, we realized that we need to take a different approach for the daily brief. And this is when I got it in my head that like, okay, so the future of this feature is not to just keep on sequentially prompting the same language model for like, hey, now do this, now do this. Instead, I think of like, you know, you go into a professional chef's kitchen and you don't see like this one giant tool that can slice, dice, and microwave and air fry on top of that. You see lots of little dedicated tools that are like in expert hands, it does that one thing and it's going to do it the best.
Chris Aquino:So one of the things that I have written down in my mad science notebook is to explore, well, what if we could dedicate some small models to specific tasks and then coordinate them in some sort of much more deterministic way? So all that to say, the daily brief currently has been kind of sidelined. And you know, we're like, okay, so shipping Assist means more or less summarization and reply generation. And daily brief, we still need to work on that because again, the approach needs to be more granular and more deterministic.
Chris Benson:I'm kinda curious. As you've taken us through that process, one of the things on my mind is how different, for lack of a better word, audiences within your customer base, different profiles, are responding to these different approaches that you've taken as you guys have devised the strategy forward on Thunderbird. Where are you seeing more uptake? I think early in the conversation, we talked about a younger generation who may not have grown up with email like we did. And then on the far side of that, you have kind of the corporate world and stuff like that, with, you know, a certain segment of your user base in different aspects.
Chris Benson:I'm just curious how people are receiving this in those different capacities, given the fact that you have different interests and stuff.
Chris Aquino:Yeah. Yeah. Well, the short answer is that our users are fairly homogenous at this point. Our users of Thunderbird Assist are very homogenous. They are all Thunderbird employees.
Chris Aquino:Gotcha. That helps. Yeah, it does. It does. And even though this work is open source, it's on GitHub right now.
Chris Aquino:We haven't released it, you know, for general use because, again, Flower had been tuning the models and making changes to their infrastructure. So they weren't ready to receive, like, a lot of users from all over the world. So within Thunderbird, there are even different needs. Some people use the individual email summary feature. Okay, so let me back up.
Chris Aquino:There are three features that are available in Thunderbird Assist. The first of which is individual email summarization. And if you aim it at, like, a quoted thread, you could call it a thread summarizer. There's email reply generation, and then the third one is the daily brief. Now, the different kinds of users fall into two camps.
Chris Aquino:There are the ones who say, you know, this long thread from the Thunderbird mailing list, I need that summarized because, wow, that is too long. The other kind of user is the one who gets way too many emails and needs an executive assistant. And that was where, as I just mentioned, it's like, okay, that feature is just not going to work very well given the approach that we've taken. But based on what some of the users have requested in response to using Thunderbird Assist, that's given us some ideas of, well, what we should be focusing on is things like semantic search. Or, because it is a Thunderbird add on and has access to more than email, like it has access to your calendar account that's in Thunderbird.
Chris Aquino:Thunderbird also does task management, and it even pulls in RSS feeds, which, RSS, it's coming back so strong. I love RSS. I think that this idea of, okay, if we could correlate between these different pools of information, that could be extremely useful to some users, which brings us back to the whole, okay, well, we need small dedicated models for each kind of data because they're going to be formatted very differently.
Daniel Whitenack:Well, Chris, I have sort of two things, well, an observation and then maybe a clarification. So number one, I love how you described this progression from kind of the one tool to accomplish every task, which is often how people do think about using these models, down to splitting this out into maybe it could be different models, it could just be different applications of the same model, but that are segmented or these sorts of things. This is so often what I recommend to people. It's kind of like when you have a junior developer and they come to you and they're like, I wrote all the functionality. It's all in this one function, thousands of lines of code.
Daniel Whitenack:You're like, Okay, we need to split this up. In this AI world, there's that need for that splitting up, and of course it makes things more testable and all of that as well. So thanks for highlighting that. I think that's a really, really practical and good point. I think the clarification, I just wanna make sure that people picked up on: you kind of referenced some of this work with Flower, and we talked about that remote inference.
Daniel Whitenack:If I'm understanding right, because you're running a local application, the data that flows to that remote inference is encrypted on the device. So it's encrypted in transit. And then if I'm understanding Flower's implementation, you can correct me if I'm wrong, that would only be decrypted sort of in a confidential enclave in the inference infrastructure. So that's when you say, like, even Flower, even if this is running in their infrastructure, they would not be able to tell you what a prompt is. Did I pick up on that somewhat in the right vein?
Chris Aquino:Absolutely, that is totally correct. For any web developers or, you know, anybody who has had to write software that interacts with an API, you're probably communicating over something called HTTPS, which is the baseline amount of encryption that we want. It's gonna encrypt the traffic between your browser, the client, and then the server. They take it a step further. There's what I would call a three part process for making sure that your data is protected.
Chris Aquino:So first off, let's say you're logged in. Right? You log in to your Thunderbird account, which we created specifically for Assist and some other services, which I will talk about a bit later, and then you are issued an API authentication token from Flower itself. Right? You're logged in.
Chris Aquino:You're now going to talk to Flower. Flower's like, yep, you logged in through Thunderbird. Cool. Here's your authentication token. Use this now to exchange public keys with yet another server, and that server does nothing besides run the, you know, run the language model.
Chris Aquino:And at that point, as you observed, any traffic between your client, between Thunderbird Assist specifically, and the machine running the model is all encrypted in between. So yeah, you're double protected, I guess, HTTPS plus the public key encryption between you and the model server.
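As a purely conceptual sketch of that second layer, not Flower's actual protocol, here is how a client could do an ephemeral key exchange and encrypt a prompt with the browser's Web Crypto API. The endpoint URLs and message shapes are made-up placeholders.

```typescript
// Conceptual sketch of end-to-end encrypting a prompt to a model server.
// NOT Flower's actual protocol -- URLs, field names, and flow are placeholders.

async function sendEncryptedPrompt(authToken: string, prompt: string): Promise<void> {
  // 1. Generate an ephemeral ECDH key pair on the client.
  const clientKeys = await crypto.subtle.generateKey(
    { name: "ECDH", namedCurve: "P-256" },
    true,
    ["deriveKey"]
  );

  // 2. Exchange public keys with the model server, authenticated by the token
  //    issued after the Thunderbird account login (hypothetical endpoint).
  const exchange = await fetch("https://model.example.com/key-exchange", {
    method: "POST",
    headers: { Authorization: `Bearer ${authToken}`, "Content-Type": "application/json" },
    body: JSON.stringify({
      clientPublicKey: Array.from(
        new Uint8Array(await crypto.subtle.exportKey("raw", clientKeys.publicKey))
      ),
    }),
  });
  const { serverPublicKey } = await exchange.json();

  // 3. Derive a shared AES-GCM key and encrypt the prompt; only the model server,
  //    not any intermediary, can derive the same key and decrypt it.
  const serverKey = await crypto.subtle.importKey(
    "raw",
    new Uint8Array(serverPublicKey),
    { name: "ECDH", namedCurve: "P-256" },
    false,
    []
  );
  const sharedKey = await crypto.subtle.deriveKey(
    { name: "ECDH", public: serverKey },
    clientKeys.privateKey,
    { name: "AES-GCM", length: 256 },
    false,
    ["encrypt"]
  );
  const iv = crypto.getRandomValues(new Uint8Array(12));
  const ciphertext = await crypto.subtle.encrypt(
    { name: "AES-GCM", iv },
    sharedKey,
    new TextEncoder().encode(prompt)
  );

  // Ship the IV plus ciphertext to the inference endpoint (placeholder URL).
  await fetch("https://model.example.com/inference", {
    method: "POST",
    headers: { Authorization: `Bearer ${authToken}` },
    body: new Blob([iv, new Uint8Array(ciphertext)]),
  });
}
```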
Daniel Whitenack:That's great. Yeah. And I think this is a great way to maybe expand people's thought process around what's possible with privacy and LLMs and how that can be split up between like where the LLM is running, whether that's local or not or both. So yeah, appreciate you going into a few of those details. I think it's really helpful.
Daniel Whitenack:As we do get closer to the end here, I would love to maybe just kind of ask you to close us out by thinking about the future now that you've run these experiments. You've kind of gone through this process. I love how we kind of went through this story of how this developed. Now that you've gone through that process, as you look towards the future, what excites you about where things are at now and where they're headed, whether for this project or maybe in terms of the wider ecosystem that you're now a part of, using this tooling around kind of remote confidential inference and that sort of thing?
Chris Aquino:Yeah, there are a lot of exciting directions that we could take this work. Again, this was sort of an initial experiment, but we are planning on shipping this with what we're calling Thunderbird Pro, which is a suite of services. My other web developer teammates are working on other things, like a web based scheduling application. There's an end to end encrypted file sending application. And there's ThunderMail.
Chris Aquino:I'm just going to say that again. ThunderMail, which is our very own email service. Okay, so one of the things that could be very interesting, and perhaps even take advantage of federated learning, thanks Flower, is if you could treat the server, the email server or, you know, another machine that is co located with the server, as another client, right, that has access to your encrypted email that's on ThunderMail. So that, you know, while you're asleep or while you were disconnected from the Internet, it could be creating embeddings or doing some other inference based on your email data and then transmitting the learnings to your local machine. Imagine if you could do semantic search without having to generate the embeddings on your laptop. And you can do it in an offline way, because as far as you're concerned, the embeddings are effectively pre generated and downloaded along with the messages themselves.
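Here is a sketch of how the client side of that could work once embeddings arrive alongside the messages; the index shape and the small local query encoder are assumptions, not a planned ThunderMail design.

```typescript
// Offline semantic search over embeddings that were pre-generated server-side
// and synced down with the messages (illustrative sketch; shapes are assumptions).

interface IndexedMessage {
  messageId: number;
  subject: string;
  embedding: number[]; // downloaded alongside the message, never computed on the laptop
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// `embedQuery` stands in for a small local model -- only the short query is
// embedded on-device, not the whole mailbox.
declare function embedQuery(query: string): Promise<number[]>;

async function searchMail(query: string, index: IndexedMessage[], topK = 5): Promise<IndexedMessage[]> {
  const q = await embedQuery(query);
  return index
    .map((m) => ({ m, score: cosineSimilarity(m.embedding, q) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map((s) => s.m);
}
```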
Chris Aquino:The trick there, of course, is doing it in a way that will satisfy, you know, the most staunch privacy advocates. They're like, wait a minute, if you have a server that's in your infrastructure and has access to my email, then it's really not end to end encrypted. So we need to figure out a good solution to that before we can explore it. But some other things that I alluded to earlier involve expanding, that's such an overloaded word, expanding the context that the model has access to. And I don't mean, like, you know, context window.
Chris Aquino:I mean, like, okay, giving it access to your calendar, to your to dos, your RSS feed. What if we added a notes application to Thunderbird, effectively making it possible for Thunderbird to be used as an LLM assisted personal knowledge management and communication tool? Whatever future that looks like, that's more exciting to me personally. I'm one of those people who has notes from the last couple of decades that I still keep around, and I would love an LLM to help me sift through that. It would be even more interesting if, as I'm making a note, it could suggest related documents and ideas that I've had in the past.
Chris Aquino:Or just, I mean, for a lot of users, just helping them stay organized. Because again, there's so much for one tiny human brain to keep track of, and there's just so much information. So I think that for me, not as, you know, an ML researcher or an AI expert, I'm just an application developer, I want to work on that. I want to build that and make it possible for people to have more control over their information, help them retain their privacy, but, you know, make those creative connections that only they as a human can do.
Chris Aquino:But an LLM, local or confidential remote compute assisted, reminding you of like, oh, you wrote this, here are some things that you've written about that are related to that. Or here's some conversations you had in email or in chat. And then for you, the user, as just a regular squishy brained human, you're like, oh, I just had this weird random flash of insight based on this constellation of information that I generated over years. I think I like that future of AI. And also as an application developer, I think that I really want LLMs to be more deterministic.
Chris Aquino:Like, it's so weird to call an API with the same data and get very different results. And we can get into this or not, but I definitely feel like chat is the wrong interface for a lot of tasks.
Daniel Whitenack:Yes, thank you.
Chris Aquino:Okay, cool. I just want to make sure I'm in good company. So I've got, I don't know, again, in my little mad science notebook, I've got ideas around, okay, how do you swap in deterministic functions? How do you coordinate the efforts? And I think something beyond, I mean, maybe I'm describing a more strict version of MCP, but the fact that your input and your output currently is plain language is a double edged sword, because the only way to determine if you've got a bad result is for you as a human to evaluate it.
Chris Aquino:Unless you spin up another language model to verify the first result. But as a programmer, that feels a lot to me like, Oh, I just wrote a function. And the only way to know if my function call was correct is to write another function to check it. Yes. And it just feels wrong.
Chris Aquino:So I really, yeah, I want discrete inputs and outputs. I want language models that are small and dedicated to specific tasks. And then I want reusable, shareable ways of wiring them together. I want to create essentially workflows of information processing within Thunderbird. So that's me.
Chris Aquino:I'm the personal knowledge management cheerleader at Thunderbird. That's my new title.
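A minimal sketch of that wiring idea, assuming hypothetical step names: each small tool gets a typed input and output, and the coordination is plain, deterministic code rather than free-form prompting.

```typescript
// Sketch of the "small dedicated tools, wired deterministically" idea.
// All step names here are hypothetical, not Thunderbird code.

type Step<I, O> = (input: I) => Promise<O>;

// Compose two steps into one, so workflows stay reusable and shareable.
function pipe<A, B, C>(first: Step<A, B>, second: Step<B, C>): Step<A, C> {
  return async (input: A) => second(await first(input));
}

interface Email { id: number; text: string; }
interface Classified { email: Email; important: boolean; }

// A deterministic local step (for example, the Bayesian importance filter) ...
declare const classifyImportance: Step<Email[], Classified[]>;
// ... followed by a small dedicated model that only summarizes.
declare const summarizeImportant: Step<Classified[], string>;

const dailyBriefWorkflow: Step<Email[], string> = pipe(classifyImportance, summarizeImportant);
```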
Daniel Whitenack:That's awesome. Well, that future is one that I could get on board with for sure after struggling with a lot of the things that you mentioned as well and also hoping for many of those things. So yeah, thank you so much for sharing this journey and this experimentation that you've been on with Thunderbird, please keep up the good work. Give our thanks to the team for inspiring us with a lot of amazing work, and thanks for sharing your insights here with us. Appreciate you taking time.
Chris Aquino:Yeah, thank you so much for having me. I'm really, really glad that we could make this happen.
Daniel Whitenack:Us too. We'll see you soon.
Chris Aquino:All right, thanks.
Jerod:All right. That's our show for this week. If you haven't checked out our website, head to practicalai.fm, and be sure to connect with us on LinkedIn, X, or Blue Sky. You'll see us posting insights related to the latest AI developments, and we would love for you to join the conversation. Thanks to our partner Prediction Guard for providing operational support for the show.
Jerod:Check them out at predictionguard.com. Also, thanks to Breakmaster Cylinder for the Beats and to you for listening. That's all for now, but you'll hear from us again next week.