Explore the evolving world of application delivery and security. Each episode will dive into technologies shaping the future of operations, analyze emerging trends, and discuss the impacts of innovations on the tech stack.
Lori MacVittie (00:04.514)
Welcome to Pop Goes the Stack, where emerging tech makes bold promises and your infrastructure quietly starts side-eyeing the cloud bill. I'm Lori MacVittie, here to ask why everything new suddenly looks suspiciously like something we used to run in a server room with questionable air conditioning. So we've got Joel "OpenClaw" Moses back. Welcome back Joel.
Joel Moses (00:27.623)
Hi, Lori. How are you?
Lori MacVittie
Good to see you. Yeah. And so...
Lori MacVittie (00:33.664)
It turns out that just throw it in the cloud hits a wall when the "it" is your meetings, your IP, and your entire operating context. Go figure. I know it's shocking. But enterprises are finally doing the math and realizing that if AI lives somewhere else, so does everything it hears, reads, and remembers. Now that's a problem when your AI is effectively a coworker sitting in every meeting and quietly hoovering up institutional knowledge.
Meanwhile, local compute has grown up. You can run real models, real workflows, even agents without shipping your data off to who knows where, which begs the obvious question, why are we still doing that? I don't know. So...
Joel Moses (01:16.739)
It's easier, Lori, it's easier.
Lori MacVittie
It's easier. I know it's a button. Well, we're going to talk about options so that, you know, people know there are options out there. And to do that, we brought on the founder and CEO of
Lori MacVittie (01:30.712)
Quill Meetings, Michael Daugherty. Welcome Michael.
Michael Daugherty (Quill) (01:34.501)
Thank you, Lori. It's great to be here.
Lori MacVittie (01:37.03)
Awesome. We're glad you came. We really are. Especially if you've watched an episode before, we're just grateful you said yes. So you are, I hear, betting that the future of AI isn't to trust the cloud; it's don't leave the building. You know, an AI Chief of Staff keeps your data private. It just doesn't let it leave. So, let's talk about what you do, what your solution is, and why you felt the need to build it in the first place.
Michael Daugherty (Quill) (02:06.908)
So again, thanks. Very exciting to be here. When we were starting Quill, one of the things we were looking at is we were thinking about local first for actually a very different reason. One was it was just because there's so much information. If I want to make AI really custom to me, there's a lot of information that I hold very close to my chest. It may be only on my computer.
And we were thinking about things like iMessage, the context that is in there that you can't get through an API, and as well as other sources. And then when we, but this was before we founded the company, so we were really like exploring different ideas. But the thing we found most interesting was when we started running local transcription and combining the transcripts that we were building up with an LLM to actually help make custom decisions that are unique to me coming out of it.
Because what we saw at the time was there were a few bots that would join your meeting and primarily they would just transcribe. First of all, if a bot shows up to a meeting that you're not in or that you're late to, it's super awkward. And so we were seeing founders meeting with VCs, nobody was using this because it's just a very awkward thing to ask someone to do. But secondly, the outputs were all focused on being very viral.
And so they were lowest common denominator. Again, let's use the VC founder example. The VC is going to actually have very different takeaways than the founder. The founder is going to want to tell their co-founder who they met with,
Joel Moses (03:47.398)
Right.
Michael Daugherty (Quill)
update their list of investors, say whether or not they liked them. The VC is going to either write a rejection email or they're going to email their team, update a CRM. Everyone actually has different takeaways. And so this all comes to this idea that
Michael Daugherty (Quill) (04:02.694)
then why do they all have the same database? Why do they all have the same exact behavior? And why is everything getting emailed to everyone right after the meeting? And so when we started building, we immediately found this very useful. One of my first tests, I will say, again before the app was really an app, but testing to see if it would work, is I had gone to the ER for just a small heart thing. An arrhythmia and, you know, I was a little nervous about it.
So after it, I had a follow-up, and my wife also wanted to learn a bit more about it. But she's a native Chinese speaker. And it was something important. She knew a doctor at one of the state hospitals over in China. And so she decided to call him up and ask him for his perspective so she could also, you know, have someone she trusted. And so we recorded the conversation and then transcribed it locally. And this was in Sichuanese.
And it transcribed and then in combination with this local transcript we now had and the LLM, we were able to actually get out like a really good takeaway, which had questions I should ask my doctor when I go to the follow-up.
Joel Moses (05:14.734)
Wow.
Michael Daugherty (Quill)
And this conversation just taught me that, wow, this is really going to work because it actually was super useful. It wasn't just a transcript I put somewhere and never look at again, because it's way too long and because it's Chinese and very hard for me to read.
Michael Daugherty (Quill) (05:28.508)
But also the transcription actually worked, even in multiple languages and locally.
Joel Moses (05:32.942)
And of course, this is information that you wouldn't necessarily want to, it's
Michael Daugherty (Quill) (05:38.265)
Exactly.
Joel Moses
health information. You don't want that going to a cloud environment. You want to keep that as personal as you can.
Lori MacVittie (05:41.57)
Right, that's very personal.
Michael Daugherty (Quill) (05:42.607)
Exactly. So that's where Quill started. And we found this resonating with a lot of people. And at the beginning, we were very much individual first. Our philosophy is make the individual as powerful and as in control of their data and their AI as possible. But we did find this resonates a lot with teams and enterprises, particularly in regulated spaces. Anything from schools, we have doctors' back offices. We have
Registered Investment Advisors, pension plans, and aerospace defense companies now as well, that use Quill. And they also like the control. And it may not all be locally on an individual's laptop at that point, but we let them set up their own inference servers and their VPN, decide where things go, how things get stored, et cetera.
And so I think that's probably a lot of the future of Quill is making sure we work for the individuals, we work for the teams and essentially give them that power over how they're using AI.
Joel Moses (06:43.27)
That's really interesting. Lori, do you remember when mute yourself was the biggest risk in online meetings?
Lori MacVittie (06:49.42)
Yes.
Joel Moses
And now it's who else is listening algorithmically and where is all that going?
Lori MacVittie
Who is listening?
Joel Moses
And that's kind of an interesting transition. Now, I imagine doing all of this processing locally places a lot of dependence on the device or
Joel Moses (07:05.522)
the laptop that someone is carrying.
Michael Daugherty (Quill) (07:06.917)
Mm-hmm.
Joel Moses
And in some ways, you may run into a scenario where someone is on a lower quality device and the transcription may not be as fast as it is on like a modern AI PC from Intel or on a Mac platform with some of the assist built into it. So it's very much kind of a sliding scale, right? So you have to figure out and profile what you can even do with the information locally, right?
Michael Daugherty (Quill) (07:34.236)
That's definitely true. It goes back to your question about partly why is so much stuff in the cloud. I'd definitely say as a developer that part of the reason is that it's much easier to develop if you can send everything to Deepgram.
Joel Moses (07:50.08)
Sure.
Michael Daugherty (Quill)
They already do a great transcription and you just have to send all your data there and then they send it back. And that's appropriate sometimes. But when we're developing locally, we do constantly find, here's somebody with a Windows device, old driver, a new GPU and all these idiosyncratic setups.
And so we're constantly working on this. Our main philosophy is that as we think through the AI pipelines, we make sure every step of it can fall back. And so for each step, say detecting when there's someone speaking at all, we have two different models we use for that, and a fallback where you may not even need to run one.
For transcribing by default, we'll use Whisper, but we'll detect on your device, we'll do some tests and see what size of the Whisper model are you able to use.
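As a rough illustration of the device test Michael describes, here is a minimal Python sketch that picks the largest Whisper size a machine can handle. It is not Quill's code. The `probe` callback, the real-time-factor budget of 0.5, and the sample numbers are all assumptions made up for the example; only the Whisper size names are real.

```python
from typing import Callable, Optional

# Whisper checkpoint names, ordered from largest to smallest.
WHISPER_SIZES = ["large", "medium", "small", "base", "tiny"]


def pick_whisper_size(
    probe: Callable[[str], Optional[float]],
    max_realtime_factor: float = 0.5,
) -> Optional[str]:
    """Return the largest model size the device can run comfortably.

    `probe(size)` is a hypothetical callback that transcribes a short test
    clip with the given size and returns processing_seconds / audio_seconds,
    or None if the model could not be loaded (for example, not enough memory).
    """
    for size in WHISPER_SIZES:
        factor = probe(size)
        if factor is not None and factor <= max_realtime_factor:
            return size
    return None  # nothing fits: the caller can save audio and transcribe later


# Made-up probe results for an older laptop (assumed values, not measurements).
fake_results = {"large": None, "medium": 1.8, "small": 0.7, "base": 0.3, "tiny": 0.1}
chosen = pick_whisper_size(fake_results.get)
print(chosen)  # base
```

Walking the list from largest to smallest means the first size that fits the budget is also the most accurate one the device can afford.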
Joel Moses (08:47.17)
Right.
Michael Daugherty (Quill)
We will also detect as it's running, whether your device is being stressed. We will change from running continuously to running perhaps even at the extreme end, we'll just save the audio, do it later after the meeting is over so that we don't interfere with your meeting.
We kind of think of Quill as the second most important thing running on your computer. If you are, for example, in your Zoom meeting, the number one thing you want to do is have a Zoom meeting. So we need to make sure we don't interfere with that. And we always back off as much as we can. And so as you go down the steps, every step of that we try and measure and make sure that
Michael Daugherty (Quill) (09:27.307)
users either can choose what models to be running at that step and or we have ways to replace the model but keep the steps going.
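The back-off behavior described here, from continuous transcription down to saving audio for later, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Quill's implementation: the load thresholds, the intermediate `PERIODIC` step, and the rule that a meeting stays deferred once it gets there are all invented for the example.

```python
from enum import Enum


class Mode(Enum):
    CONTINUOUS = "continuous"  # transcribe as audio arrives
    PERIODIC = "periodic"      # assumed middle step: transcribe in occasional batches
    DEFERRED = "deferred"      # only save audio; transcribe after the meeting


def choose_mode(system_load: float, busy_at: float = 0.7, overloaded_at: float = 0.9) -> Mode:
    """Map a 0.0-1.0 load reading to a transcription mode."""
    if system_load >= overloaded_at:
        return Mode.DEFERRED
    if system_load >= busy_at:
        return Mode.PERIODIC
    return Mode.CONTINUOUS


def plan_meeting(load_samples):
    """Return the mode used for each load sample.

    Once the transcriber has fully deferred it stays deferred for the rest
    of the meeting, a conservative assumption that keeps it out of the way.
    """
    modes = []
    deferred = False
    for load in load_samples:
        mode = Mode.DEFERRED if deferred else choose_mode(load)
        deferred = deferred or mode is Mode.DEFERRED
        modes.append(mode)
    return modes


plan = plan_meeting([0.3, 0.75, 0.95, 0.4])
print([m.value for m in plan])  # ['continuous', 'periodic', 'deferred', 'deferred']
```

The point of the design is that the meeting app always wins: the transcriber only ever gives up resources during a call, and catches up afterward.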
Joel Moses (09:38.138)
That's interesting.
Lori MacVittie (09:38.349)
I liked that you brought up that basically if I'm in this organization, like we could have our TS/IT, right the people who manage infrastructure, set up an inference server that could do the transcription. And then you could use that, which takes the onus off of the individual users and making sure their hardware is okay. And it puts control back into the company's
Lori MacVittie (10:06.221)
right, sphere, if you will,
Joel Moses (10:08.005)
Right.
Lori MacVittie
because now they have it there. So I like that option. I think that needs to be an option for more things actually, right? To make that easier, because you're right, too often everything ends up in the cloud because it's just easier to call an API darn it. And that's true, right? They're the ones that set it all up and manage the authentication and handle all the web stuff. They do it all. It's great, but it's not always optimal for the user or the company.
Joel Moses (10:37.89)
That's right. No, please go ahead.
Michael Daugherty (Quill) (10:39.227)
Something I think, oh, sorry.
Joel Moses
No, please go ahead.
Michael Daugherty (Quill)
Something I've thought about is that I would feel really good if the application I build can survive beyond the company. You know, cause I think about like the software we used to get on our computers back in the early 2000s. And yet everything goes away when the company goes away because now you're not, you don't have that cloud anymore.
Joel Moses (11:04.144)
Right. Right.
Lori MacVittie (11:06.493)
like video games
Joel Moses
Yeah.
Michael Daugherty (Quill) (11:08.143)
Exactly.
Lori MacVittie
and digital content and yeah.
Joel Moses
Yeah, you know, it's interesting you mentioned that, Michael. I was thinking about exactly what a local first utility like this really means. And I think what it means, for the most part to me, is that what you're doing is assisting users in developing a local lexicon of information and exchanges that they use.
Joel Moses (11:29.862)
And it allows you the capability to kind of personalize your reaction to the lexicon as you experience it. So, for example, if I'm in a Zoom meeting and I'm running Quill in the background, I can hit a keystroke if something occurs in the meeting that I find particularly worthy of highlighting.
Now I can't do that with the built-in transcription utilities for Zoom. It's recording the meeting for all attendees. But if I find something that is personally interesting to me, I can highlight that. And those highlights then roll back into an area of focus within Quill, which is unique to me. Right.
And so when I take that information and I ask for transcriptions to be made, it knows where to focus because I told it where to focus. And I think that that's a really interesting idea.
Michael Daugherty (Quill) (12:17.807)
That's right. And I suppose it comes from the same sort of philosophy of each individual user may want to do something different with an individual meeting or with their whole corpus. And so yeah, during the meeting, we have a floating overlay and you can open up a notepad and take your own private notes. These are not shared with anybody else. And you can use the keyboard shortcut to jot them down. You can take a screenshot at a certain point, or you can just type something in.
So I find maybe three or four times during a meeting there's often something that I'm thinking that wouldn't make sense to say right now. But it's like, oh, this made me remember I also have to do x, y, z after the meeting, this sort of thing. And so I can just write a few words and then afterward, Quill will try to analyze what you wrote down, why, and at what point, and give itself some notes.
And then as you generate the outputs, for example, maybe I want to generate a follow-up email to somebody, it will take those private notes and they influence the output. They don't necessarily get written in the output directly as quotes to share with somebody else, but it influences how it's writing that output email because maybe I've said, "I really like this guy, I'm going to hire him."
And so it knows something about what he said right then is something we should put in the output.
Joel Moses (13:39.163)
Yeah, that's really fascinating. The idea that the transcription and the lexicon of information that you're building up actually has some personalization related to it that is private, that is unique to you, but you can choose to let it influence the outcome or you can use it as a way to highlight or dig deeper.
Michael Daugherty (Quill) (13:55.643)
Mm-hmm.
Joel Moses (13:56.04)
I also love the idea of the lexicon also being accessible to other tools. And you guys have spent a lot of time ensuring that if you do want to use or do want to send things through LLMs or have LLMs use Quill as a data source, you've done a lot of work with MCP. You're both a server and a client, right?
Michael Daugherty (Quill) (14:15.352)
Yeah, and they serve different purposes. The server is a local server that you can optionally turn on and connect it to Claude Code, connect it to Cursor. With Claude, we actually also have a desktop extension you can one-click install. And this is because people use different tools for different things.
And sometimes you're using Claude Code and you're building your own personal CRM or perhaps what I'm often doing is actually as a product developer, I'll interview a lot of people and so I'll say, Claude, hey, pull in my last three conversations with customers, run them through the customer profiles and try to figure out how do we fix the issue that they have.
And it's nice because it has access to this information and that does come from our philosophy again of this information is yours, so use it where you want to use it. If you are most comfortable in Claude, like why do you have to ask all your questions within Quill? And then we have it go the other direction too. We have Claude, we have actually quite a large set of tools; anything that the agent inside of Quill can do gets exposed through this MCP server.
So we can update your dictionary if you say someone's name is Atrof and it was mistranscribed, say, hey correct this and remember it for next time and it will start adding it to the dictionary within Quill. And where we're going with this is also our automation system: the next version of it, which I'm working on right now, is basically going to be editable by coding agents. And so you can do much more complex automation.
And I think this will work with enterprises again quite well because a lot of our enterprises say Quill's great, but it's got so many things it can do. When I deploy it to my team, I want to set it up in the right way. I want to be sending transcripts to our internal webhooks, and I want it filtered to only sales meetings, and I want my custom template to run on every sales meeting and all this stuff.
And so I suspect we're going to this future where everyone is using coding agents to do a lot,
Joel Moses (16:31.067)
Mm-hmm.
Michael Daugherty (Quill)
so this will make it a lot more expressive. This is not quite out yet.
Michael Daugherty (Quill) (16:34.682)
Probably beta testing for the next month and a half or so, but I think it's going to make Quill a lot more powerful.
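To make the enterprise automation request concrete (only sales meetings, a custom template, results sent to an internal webhook), here is a small Python sketch of what such a rule might look like as code a coding agent could edit. Everything in it is hypothetical: the `Meeting` and `Rule` types, the tag-based filter, and the example URL are assumptions for illustration, not Quill's automation format. It only builds the webhook request and never sends it.

```python
import json
from dataclasses import dataclass


@dataclass
class Meeting:
    title: str
    tags: list
    transcript: str


@dataclass
class Rule:
    required_tag: str  # only run on meetings carrying this tag, e.g. "sales"
    template: str      # custom template applied to matching meetings
    webhook_url: str   # internal endpoint that would receive the result

    def matches(self, meeting: Meeting) -> bool:
        return self.required_tag in meeting.tags

    def build_request(self, meeting: Meeting) -> dict:
        """Build (but do not send) the webhook request for one meeting."""
        text = self.template.format(title=meeting.title, transcript=meeting.transcript)
        return {
            "url": self.webhook_url,
            "body": json.dumps({"title": meeting.title, "text": text}),
        }


def run_rules(rules, meetings):
    """Return one pending request per (meeting, matching rule) pair."""
    return [r.build_request(m) for m in meetings for r in rules if r.matches(m)]


# Example data; the URL is a placeholder, not a real endpoint.
rule = Rule("sales", "Sales call: {title}\n{transcript}", "https://hooks.example.internal/sales")
meetings = [
    Meeting("Acme renewal", ["sales"], "Discussed pricing."),
    Meeting("Standup", ["internal"], "Status updates."),
]
pending = run_rules([rule], meetings)
print(len(pending))  # 1
```

Keeping the rule as plain data plus two small methods is what would make it easy for an agent, or a person, to review and change.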
Joel Moses (16:41.36)
That sounds great.
Lori MacVittie (16:41.775)
I also liked the kind of the recognition. I mean transcripts, generating a transcript, is basically just using AI to automate something that someone used to do.
Michael Daugherty (Quill) (16:52.985)
Mm-hmm.
Lori MacVittie
A person could do that or take notes, right? That's automation and that's where we are. And that's good. But like you took a step forward and said, well, we have AI, what else could we do with it? How could we make it better? It's kind of the difference between
Lori MacVittie (17:08.003)
using AI for automation and using it for like innovation. Like here's a totally new thing. And I like that approach because we should start with automation, but eventually, right, we can do so much more. We just have to consider the possibilities, I guess.
Joel Moses (17:22.555)
Yeah.
Lori MacVittie
So, but not right now because we're almost out of time. So yeah, no, whatever it was, Joel, send it to me in a, you know, have somebody transcribe it, send it over. But you know, so what,
Lori MacVittie (17:37.525)
so if somebody's listening, other than, hey, this is a really cool option, like what should they take away in general that they can apply? What do you, do you want to start Michael or you want Joel to go so you have a moment?
Michael Daugherty (Quill) (17:48.027)
Sure. I can start. Maybe we bounce back and forth a little bit
Lori MacVittie (17:54.06)
Okay, okay.
Michael Daugherty (Quill)
because I'm sure Joel's thoughts will also spark some of mine. But you know, there are different profiles of people that might be listening to this, which is probably why everyone needs their own Quill to get their own takeaways.
Lori MacVittie (18:01.803)
Haha, nice.
Michael Daugherty (Quill)
But one thing I think about as a developer is like, don't be afraid to push things out to the edge.
Michael Daugherty (Quill) (18:18.052)
I think the model capabilities, especially open source models, are increasing very fast as well as software, sorry, hardware is getting better. And you can see the hardware providers like Apple making bets on a hardware guy as their new CEO, right? And so I think this capability line is going to continue to increase.
And you can both give users, you know, respect your users, but also you can potentially improve some of their latency, their performance, et cetera, by distributing your inference loads. And to me, that's been one of the more interesting things to think about as we built Quill.
Joel Moses (18:56.432)
Yeah, I kind of summarize Quill as kind of the--as I usually do I reach back into movie science fiction--they're basically helping you create the Tron identity disk. The thing that you stick on your back and you carry with you and it records everything that ever happens to you. And then you get to choose exactly how that is applied and how it is used.
But it always stays with you. And I think that's interesting. I think we're entering a phase right now where AI isn't just about capability; it's about where to set the boundaries. And I think local-first tools like this allow you to create a boundary that says this information I want to keep private, but I want to use an LLM. I want to use AI to do this thing.
So I'm going to carve off a little section of this identity disk and I'm going to have the LLM process that. So it's not just about, can it be done? If it was only about capability, we'd upload our entire lives into the cloud and be perfectly comfortable with that. I think we're entering a phase now where we have to examine what the boundaries are. Both for privacy, and for regulation, and for all the other stuff that matters.
Michael Daugherty (Quill) (20:05.465)
And actually, one more takeaway related to this idea of the identity disk is that the context is actually the competitive advantage in AI right now. And so during this time period is when you're starting to build the context that's going to become more powerful over the next few years because people are starting to realize this context is particularly interesting.
And you can, and I think where, where I think about Quill in relation to this is like building that context yourself so that it is your identity disk. And you can take that to the next model provider, et cetera. Because as you see, the model providers are all passing this capability, this line where their capability is very good. And it becomes easier and easier as they're smarter and smarter to switch between models that you might be using. But the thing that they'll never copy is your context and your judgment and your experiences.
Joel Moses (21:00.134)
Your individuality.
Michael Daugherty (Quill)
Mm-hmm
Joel Moses
Yeah.
Lori MacVittie (21:04.033)
Absolutely. I like that. So takeaways, read the privacy policies, read your terms of service, understand what it is you're sharing, and then maybe make a decision on that. Tools exist that allow you to do local first. I think that's good because AI is not choosing sides. It's running everywhere and it can run local and sometimes maybe it should. So that is a wrap for Pop Goes the Stack.
If today made you reconsider where your modern stack actually lives, go ahead and hit subscribe. We'll keep questioning progress, one architecture diagram at a time. Until next time, keep your latency low, your control close, and maybe don't unplug that old rack just yet.