High Agency: The Podcast for AI Builders

In this episode, we sit down with Michael Royzen, CEO and co-founder of Phind. Michael shares insights from his journey in building the first LLM-based search engine for developers, the challenges of creating reliable AI models, and his vision for how AI will transform the work of developers in the near future. 

Tune in to discover the groundbreaking advancements and practical implications of AI technology in coding and beyond.

I hope you enjoy the conversation and if you do, please subscribe!

--------------------------------------------------------------------------------------------------------------------------------------------------
Humanloop is an Integrated Development Environment for Large Language Models. It enables product teams to develop LLM-based applications that are reliable and scalable. To find out more go to humanloop.com

What is High Agency: The Podcast for AI Builders?

High Agency is the podcast for AI builders. If you’re trying to understand how to successfully build AI products with Large Language Models and Generative AI then this podcast is made for you. Each week we interview leaders at companies building on the frontier who have already succeeded with AI in production. We share their stories, lessons and playbooks so you can build more quickly and with confidence.

AI is moving incredibly fast and no one is truly an expert yet. High Agency is for people who are learning by doing and will share knowledge through the community.

Where to find us: https://hubs.ly/Q02z2HR40

Speaker 1:

We started like thinking as the AI. We were like, okay, if some human came to me and they asked me what I'm asking of this model right now with all the information I'm giving it, could I as a human reliably produce the right answer? And Paul Graham just kind of looks at Jessica and he's like, you know, our grandkids aren't gonna know what a search result is. They're just gonna have answers. Running our own models was a way for us to figure out like, hey, how can we make a custom coding model that will run fast and that people can use most of their time as the default?

Speaker 2:

This is High Agency, the podcast for AI builders. I'm Raza. So I am delighted to be joined today by Michael Royzen, who's both a close friend and also a bit of a prodigy when it comes to machine learning. He was one of the younger interns at Microsoft Research as a 17-year-old, started his first machine learning research company straight after that, and he's now the CEO and cofounder of Phind, which is an answer engine for software engineers.

Speaker 2:

They were the first company in the world to have an LLM-based search engine available, and they're backed by Paul Graham and Y Combinator. And now over 90 million answers have gone through Phind, and they have millions of active users. So, Michael, it's a pleasure to have you on the show.

Speaker 1:

Thank you, Raza. It's great to be here.

Speaker 2:

So, Michael, to start with, for our audience, just help them understand: what is Phind? How do you describe it to someone new?

Speaker 1:

Yeah. So Phind is a tool that helps developers get from an idea in their head to a working product. That's the core of what we do. And so we started really as an answer engine, like you described. We were actually the very first LLM-powered answer engine, where a user puts in, like, hey. How do I do this?

Speaker 1:

Or, hey. Like, this isn't working. And we'll do an Internet search. We'll figure out what's relevant to answering your question, and then we'll use a large language model to synthesize all that information into a concise and hopefully correct answer to your question. And now we're moving more broadly into how can we integrate that into your workflow as a developer as deeply as possible to really help you go from, hey.

Speaker 1:

I wanna build this thing and, like, helping you actually have it built.

Speaker 2:

Okay. So idea to product could mean a lot of different things. Like, give me an example. Walk me through a particular use case.

Speaker 1:

So you say, like, hey. Like, I want to build a Next.js app that takes in a user's question, and I want it to go out and get all these sources, synthesize them, and send them to an LLM. Like, one of our internal tests is actually, like, being able to build Phind using Phind. And so we kind of enjoy that one because it's a measure of how usable the product is.

Speaker 2:

So the idea is a developer should be able to come in with an idea, say to Phind, hey, I wanna build x, and then it's gonna try and instruct them on how to do that. It's not gonna build it for them necessarily, but it'll maybe write some of the code and talk them through the process.

Speaker 1:

So the current product that's available online is really kind of more chat based, but we're working on something. By the time this is released, I think we'll have something to say about it publicly. We're working on something that will actually write the code for you, like, actually create the files and test them and, like, orchestrate all of that. And so, you know, we started with this more chat based UX because it was very natural, and that was kind of what the technology was limited to at the time. But now very quickly, we're moving towards having the LLM kind of like a Tesla autopilot take over more and more of the dirty work and the stuff that, you know, we as programmers don't really find fun, like fixing things that aren't working and, like, instead shifting the fun things like design and figuring out, like, how this is going to work over to the human.

Speaker 1:

So the human can still do the things that they find fun, and it'll kind of be like the Tesla autopilot for getting your project working rather than, say, like, a Waymo type analogy that, you know, just completely doesn't ask you for feedback at all.

Speaker 2:

Okay. So it's a coding assistant that's gonna help me go from idea to developed product. Obviously, there's been an explosion of different AI coding tools recently, starting from GitHub Copilot, which I think was the first really big commercially successful LLM product. But now there's Cursor, there's Cody, there's a whole bunch of things out there. How is Phind different to those?

Speaker 2:

What are you trying to do that sets you apart?

Speaker 1:

So what we're doing right now is we're really reimagining what coding should look like from the ground up. So I think those other companies have done things that are pretty interesting, but I think they're fundamentally limited by being shoehorned into an existing IDE. What we're working on right now is really taking the Phind magic and having a native app on your computer built from the ground up for this, like, AI-native idea-to-product workflow rather than being limited by kind of what people currently think an IDE is. We have the ability to completely reimagine all of this from an AI-first perspective because, like, our hypothesis here is that, like, what we're really kind of limited by is the AI and AI progress, and it's on this very steep improvement curve. Whereas, like, you know, the IDE's already been invented.

Speaker 1:

Like, we don't want to reinvent the IDE as it is. Like, we wanna be able to just kind of take the first principles experience that we wanna provide and then rethink it completely from the ground up. So that's what we've been working on.

Speaker 2:

So if you are gonna build an IDE from the ground up for developers, so kind of an AI first from scratch IDE. Obviously, that's gonna come with a lot of cost in the sense that developers are already familiar with their existing IDE. There's a ton of keyboard shortcuts. They, like, understand how to work within them. What are you imagining with first principles from AI that you can do differently, or, like, it's gonna be, like, fundamentally different as a product experience that makes you think it's justified to, like, reinvent the whole product rather than integrate the AI into the existing workflow?

Speaker 1:

The biggest idea here is that developers will be writing code less, and IDEs are basically solely optimized for writing code. An IDE is designed purely for, like, okay. Here's this code I wanna write. How can I write it? And what we're a lot more interested in, and what we think will make developers superhuman, is helping them think better and focusing on that, focusing on, like, helping developers themselves kind of fully unlock their thoughts and go from, like, an idea to an understanding much faster.

Speaker 1:

And then we'll write a lot of the code for them. So the kind of the big insight here is that we think that, like, yeah, the actual code writing should not be the focus of this product that a human's sitting in front of. The focus should be on, like, how can we help pull, like, all of the creative juices out of you and help you understand your own ideas better and formalize them and do the algorithm design and all of that? Because, like, that's the part that we find fun really as developers. And that's still really hard.

Speaker 1:

Like, you can't, you know, code anyway if you don't have that. And it's something that, like, AI currently isn't good at either. So that's kind of like what we think will stay true in the long term: that, like, humans will continue to be better than AI at these high-level, how-do-I-do-this type tasks. And, like, sure, the LLM can help pull it out of the human, and that's what we're really kind of working on. But, like, if the LLM can truly invent new products as well as humans can, then, like, once we get there, there's all sorts of other wide-ranging implications for society.

Speaker 1:

So we won't get there for quite a bit is one of our bets.

Speaker 2:

So I'm kinda coming along with you, but so it sounds like the argument is that coding is only part of what you're doing when you're doing software development. The IDE is built around helping you write code. It's a text editor with a bunch of extra features, and you're saying, hey. A lot of the actual, like, producing the text of the code, we can automate a lot of that. The hard part is the thinking and the coming up with the product and solving problems.

Speaker 2:

So assume that I grant that premise. I just believe you on that. How will the product be different from a traditional IDE such that it enables this?

Speaker 1:

Sure. It's focused a lot on our conversation with the AI and also real time previews that are not really possible with an IDE because so much kind of space and focus is on the code itself. And so, like, what we wanna put disproportionate focus on versus traditional IDE, like, the user themselves has a hypothesis. Like, they're building this app. They're trying to test their own hypotheses as quickly as they can.

Speaker 1:

How can we facilitate that through a product? So we have these user interfaces that make, like, these very fast, like, UI and functional mocks that the user can very quickly kind of play with, interact with, test, and get feedback on. And that's kind of a very big part of the UX when using the new Phind experience versus solely this IDE UX: mostly code, maybe it has, like, a chat bar on the side.

Speaker 1:

I think what those products are really missing is kind of, like, a more real time, like, interactive experience that lets you really feel it and really use the product and, like, tighten that iteration loop.

Speaker 2:

Now I'm beginning to be able to picture it. So it's actually saying there should be something much more interactive here. I think Anthropic recently had, like, the Artifacts product, and it sounds like it's got bits of that kind of intuition in it, where I wanna be able to have an idea, see a render of it, interact with it.

Speaker 1:

Exactly. And, like, Claude Artifacts, I think, is actually a brilliant product. You know? I use it myself for certain things. But I think it's clear by now that what developers really want is something that's, like, deeply integrated into their workflow.

Speaker 1:

Time and time again, we hear from our users that, like, you know, they don't like going to the web. They don't like switching tabs, and that's why, like, we have a VS Code extension.

Speaker 2:

So, Michael, one of the reasons I wanted to interview you in particular was how early you were to the space. You guys built the first, quote, unquote, answer engine, the first system that used LLMs to do question answering that was widely publicly available, well ahead of Perplexity, well ahead of others. And you did it while still a college student and in a world where the technology was very different. You didn't have LLM APIs yet. So I'm kinda curious: when you started, what did it take to build a product then?

Speaker 2:

How has that changed?

Speaker 1:

Yeah. It was, like, the Wild West back then. I started on this in the fall of 2020, shortly after I had played around with the Hugging Face transformers library, which I think had come out a couple months earlier. And so I was playing around with all of these BERT models, and I was building an invoice, like, unstructured extraction tool with all these BERT models.

Speaker 1:

And I was really fascinated by, like, the fact that these models can, like, classify text super well, but they can't really write text. And the state of models that could write text at the time, like GPT-2, wrote, in my opinion, like, complete gibberish. I think GPT-3 had also come out at the time, and I tried it out. I wasn't really impressed by it.

Speaker 1:

I was like, this isn't, like, that much better than GPT-2, which I know is a bit of a hot take because, like, obviously, it's, like,

Speaker 2:

a whole

Speaker 1:

like, I understand all the research and the hard work that went into it, and I definitely appreciated that. And I noticed that the text that it generated was a lot more coherent, but it was, like, still fundamentally as useless as GPT-2 because it couldn't answer questions, which was the use case that I was really excited about. And so I went really deep down this rabbit hole figuring out, like, well, how can we have these large language models answer questions?

Speaker 2:

And when you say it couldn't answer questions, you mean factual questions, because the model only knows what's in its pretraining dataset. So if I ask a question that's outside of that, and also, I guess, this was before instruction tuning. So

Speaker 1:

Right.

Speaker 2:

There was no reason for the model to, like, try to be accurate. They were just next word prediction machines at the time.

Speaker 1:

Exactly. There was no, like, widespread notion of instruction tuning, at least outside of the research labs at the time. And so even when you few-shotted the model, like, you provided a couple examples of, like, here's what I want a sample response to look like, it was still very iffy. Like, it would go off the rails all the time. Like, what those models really wanted to do, and what they were trained to do, is just, like, tell stories and repeat Internet text.

Speaker 1:

And they were quite good at that. Like, if you could kind of prompt it in such a way that it could, like, generate a story or even, like, an email, it was decent at those things. But, like, even with a lot of tuning, it was very difficult for it to be good at answering any sort of question, especially reliably. And so I was kind of intrigued by this. But what really made me obsessed with this space is when I saw a demo by this ex-Meta researcher who I think was at Hugging Face at the time.

Speaker 1:

His name was Yacine Jernite, and he made this demo using a BART model, which was this encoder-decoder model released by Meta, 500 million parameters. So very small by today's standards. And he fine-tuned it on Reddit to basically follow this instruction format of wanting to answer in response to a question. And then he also connected it to a RAG system.

Speaker 1:

So he set it up so that, given a question, it would perform a lookup on Wikipedia and give, like, the 10 most highly ranked results back. And the funny thing is it was super well implemented. He implemented it both using sparse retrieval, using something like Elasticsearch, and also dense retrieval using Meta's dense vector retrieval database at the time. And it still flew off the rails a lot, but I was intrigued enough by this problem to kind of take that as a starting point and to kind of see, like, how far I could push this. And so a couple months go by, and I tune the model myself.
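
A rough sketch of that retrieve-then-generate pattern, assuming Hugging Face transformers and an Elasticsearch index of Wikipedia passages; the index name, field names, and the generic BART checkpoint here are illustrative stand-ins (the demo used a QA-fine-tuned checkpoint), not a reconstruction of the actual code:

```python
# Retrieve-then-generate: sparse (BM25) lookup over a Wikipedia index,
# then a BART seq2seq model conditioned on the question plus passages.
from elasticsearch import Elasticsearch
from transformers import BartForConditionalGeneration, BartTokenizer

es = Elasticsearch("http://localhost:9200")
tokenizer = BartTokenizer.from_pretrained("facebook/bart-large")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-large")

def answer(question: str, index: str = "wikipedia", k: int = 10) -> str:
    # Pull the top-k passages that match the question.
    hits = es.search(index=index, query={"match": {"text": question}}, size=k)
    passages = " ".join(h["_source"]["text"] for h in hits["hits"]["hits"])

    # Condition the seq2seq model on the question plus retrieved context.
    prompt = f"question: {question} context: {passages}"
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=1024)
    output_ids = model.generate(**inputs, max_new_tokens=256, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```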

Speaker 1:

I do some work on the document retrieval system. I kind of set this up end to end. And not that much progress. Like, I couldn't make it that much better. Like, I made it a little bit better, but not that much better.

Speaker 1:

And so I shelved this idea for almost a year. I still had, like, a year and a half left in college at this point. I started working with this professor at UT named Greg Durrett. I came to him in the fall of 2021, almost a year later. And I was like, hey.

Speaker 1:

You know, there's been some development in the space. I worked on it kind of, like, previously, like, a year ago. But now there are these new models called T0, which is a derivative of the T5 model released by Google in 2020. But this model is fine-tuned from the get-go to be a lot better at following instructions for answering different kinds of questions.

Speaker 2:

And this was one of the first papers, if I recall correctly, that, like, was trying to frame all NLP tasks, all natural language AI tasks, as just text-in, text-out tasks. Like, in the previous paradigm, you would have, like, custom models for every use case. For listeners who, like, have joined us in the post-LLM world, where we have these very general-purpose models, it's maybe difficult to appreciate that even just 4 years ago, that was a very new idea. So previously, if you wanna do classification, if you wanna do NER, if you wanna do text extraction, you would build a custom dataset for every task and a custom model for it. And I think T5 was one of the first papers, if not the first, where they said, hey, why don't we take this mindset where all of these tasks, whether it's classification or NER or question answering or summarization, they're actually all just tasks that take some form of text in and they spit text out, and we can frame all of these different problems with one model.

Speaker 2:

Again, it's very easy to take that for granted today because it's become what everyone does. But I think at the time, it was extremely novel.

Speaker 1:

Yes. And this paper is kind of what made me realize that now it's actually possible, I think, to build a useful product of this type. And what was so new about it was the fact that, kind of like you said, Raza, they created a singular dataset that had many, many different kinds of prompts across many different kinds of tasks. So it was one model that was fine-tuned for, like, a very kind of diverse set of question-answering tasks, and even, like, text extraction tasks, and I think some other tasks were in there as well.

Speaker 1:

And it was also significantly larger than BART. So unlike BART, which was, like, roughly a 500-million-parameter model at the largest size, there are 2 versions of this model, including a 3-billion-parameter one. And so the larger size, I think, helped a lot as well with making the model more coherent and staying on the rails, even though it's tiny by today's standards. And so, because the model was less likely to go off the rails to begin with, I took inspiration from Yacine's strategy of really kind of focusing on Reddit.

Speaker 1:

And so we focused it on 2 categories: I focused it on Reddit and on Stack Overflow and Stack Exchange questions, to try to have just a large dataset of kind of a more singular format on top of all of the work that they already did with T0. And so I fine-tuned it for that very specific format to improve the performance specifically for, like, general question-answering tasks. And then the other thing I did was really on the retrieval system. So I wanted this to be able to retrieve information from basically the entire Internet.

Speaker 1:

And, like, I didn't wanna, like, cop out by just using the Bing API. So I started with a Common Crawl dump, and then I downloaded all of Common Crawl, which is basically a breadth-first kind of dump of the Internet. So for any given website, they might not necessarily have the whole website, but they include most of the publicly accessible websites on the Internet in this dump. It's like 3 terabytes worth of text or something ridiculous like that. And so I wanted to keep only, say, like, the top 10,000 websites, because I guess that's where most questions would be answered from.

Speaker 1:

And then I got rid of the long tail and focused just on keeping the Common Crawl pages from those first 10,000 websites and then indexing those using an Elasticsearch cluster. And I was able to get it running on, like, a relatively small EC2 instance on AWS: one instance where you ask it a question and it's able to retrieve relevant documents from, you know, the most popular websites on the Internet that contain the answers to that question. And when I put that system together end to end for the first time, at the end of 2021, at the beginning of 2022, I did a Show HN on Hacker News in January 2022 being like, hey, I put this thing together. And we got some mixed feedback.
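
A rough sketch of that filter-and-index step: stream Common Crawl WET (extracted text) records, keep only pages whose domain is on a top-10,000 allowlist, and bulk-index them into Elasticsearch. The file paths, index name, and allowlist source are illustrative assumptions, not the original setup:

```python
# Filter a Common Crawl WET file down to an allowlist of popular domains
# and index the surviving pages into Elasticsearch for keyword retrieval.
from urllib.parse import urlparse
from warcio.archiveiterator import ArchiveIterator
from elasticsearch import Elasticsearch, helpers

TOP_DOMAINS = set(open("top_10k_domains.txt").read().split())  # hypothetical allowlist file
es = Elasticsearch("http://localhost:9200")

def docs(wet_path: str):
    with open(wet_path, "rb") as stream:
        for record in ArchiveIterator(stream):
            if record.rec_type != "conversion":  # WET text records only
                continue
            url = record.rec_headers.get_header("WARC-Target-URI")
            domain = urlparse(url).netloc.removeprefix("www.")
            if domain not in TOP_DOMAINS:
                continue  # drop the long tail
            text = record.content_stream().read().decode("utf-8", errors="ignore")
            yield {"_index": "web", "_source": {"url": url, "domain": domain, "text": text}}

# Repeat per WET file in the dump.
helpers.bulk(es, docs("CC-MAIN-example.warc.wet.gz"))
```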

Speaker 1:

People who tried it and it worked were like, wow. This is, like, game changing. And then other people, like, didn't get what they wanted, and they

Speaker 2:

were like, yeah.

Speaker 1:

I don't know about this. But, like, when it did work, it was so magical that I was obsessed with this idea. Like, I couldn't sleep. I couldn't stop thinking about it. And so I applied to YC.

Speaker 1:

My YC application was basically like, hey, look, this is how people are going to find information. And the central question that I was struggling with, and what, you know, people were asking me too, like, I was having, like, coffee chats with some friends, some who were in the VC business. And they're like, yeah.

Speaker 1:

Who's this for? And I'm like, I don't know. It's, like, for me? You know, I kind of like, like, Wikipedia-style searches and stuff, but I don't really know if, like, that's, like, a long-term winnable business versus Google. And so I decided around that time to focus this on software developers.

Speaker 1:

And this is kind of like a twofold hypothesis. The first one was that we can help software engineers today and at the time by basically being like a better Stack Overflow. Like, they have a question that can be answered from looking at something on the Internet. Boom. We can answer it.

Speaker 1:

And the second hypothesis was kind of like the Paul Graham hypothesis that he wrote an essay about all the way back in 2012, which was that perhaps it is possible to displace Google. And, like, the way to do that would be to first make, like, a search engine that all of the developers use. Because that's what Google actually did: Google got all the hackers to use Google, and then they kind of opened the scope later, rather than going overly broad and being everything to everyone all at once. And so we focused on developers, got into YC.

Speaker 1:

And immediately after college, like 2 weeks later, we started that, launched as, you know, initially Hello, and then we rebranded to Phind. But ever since the summer of '22, we've basically been doing what we're doing now. So that's a long-winded tour of our origin story.

Speaker 2:

Before we dive into the technical details of how Phind works, one thing I'm curious about: Paul Graham, the founder of YC, doesn't do that many, like, active investments anymore. He's not a partner at YC most of the time. You guys are one of the rare companies that he's personally invested in. Like, how did that happen? How did you convince him to invest?

Speaker 2:

I'm kind of personally just very curious about this.

Speaker 1:

Well, we basically told him about the vision, and we showed it to him. And we're like, hey. Like, people don't want links. People want answers. This is the future.

Speaker 1:

And, like, looking at him, I could see that the gears were turning in his head. He was like, yes. I think he was happily surprised by it. And the way that it worked out, I chose, like, the last office hours spot that he had that day, that he was doing in person. And so he got so excited talking about this with us.

Speaker 1:

First of all, I just, like, asked him to invest on the spot. Like, I could see the gears are turning. I was like, hey. Do you wanna invest? And he's like, yes.

Speaker 1:

Of course, I wanna invest. Like, this is so cool. And then he was like, hey. He checked his watch. It was, like, 7 or 7:30.

Speaker 1:

He was like, hey. Like, I have to go, like, home to Jessica. I have to cook dinner. But do you guys wanna come with? And so we're like, of course, we wanna come with.

Speaker 1:

And so he takes us to his lovely home, and we kind of, like, sit down. And Jessica's there too, so we got to meet Jessica. And Jessica's absolutely wonderful. I think, like, she does not get nearly enough credit for, like, her role as literally the cofounder of YC. But she's amazing.

Speaker 1:

And we're sitting at, like, their backyard table, and Paul Graham just kind of looks at Jessica. And he's like, you know, our grandkids aren't gonna know what a search result is. They're just gonna have answers. And this was kind of like a mind-blowing moment for everybody because, like, this was pre-ChatGPT. This was, like, long before LLMs were kind of, like, known to answer questions, you know, in a way that people were consuming.

Speaker 1:

And so, like, we all just got super excited about this idea of, like, hey. We can answer questions directly instead of just sending people to links.

Speaker 2:

Since then, it feels like you guys have shifted the vision somewhat from being, like, a search engine for developers with a view to taking on Google. I'm kinda curious: what do you think of Perplexity? It's got a huge valuation. It's been growing really fast.

Speaker 2:

It is closer maybe to the original Phind thesis than you guys are now. Do you think it's gonna be successful?

Speaker 1:

There's a couple of things that happened that I think shifted my perspective on the whole industry. The first thing is the release of ChatGPT. And I think that's really the main catalyst that made it a lot more difficult for a startup to compete in the space, because it was so much better than everything else for so long that ChatGPT is kind of, like, what became known as, like, the original AI answer engine. And what that caused was the famous code red alert at Google. Like, it forced everyone to be like, we have to get on this.

Speaker 1:

Like, one of our original hypotheses was that, like, Google was gonna be, like, very slow to market, not just because, like, they have all these safety bureaucracy issues inside the company, but also, like, there's political reasons as well why they wouldn't want, like, their AI saying things that could be wrong or offensive.

Speaker 2:

And I guess for Google also, it's a classic innovator's dilemma as well. Right? They get a huge amount of their revenue through search results.

Speaker 1:

ChatGPT's success blew all of that up immediately. And so we see now, like, Google is building generative LLM results directly into search. And, yes, like, they still have a lot of work to do on improving the quality of that. But, like, for fundamentally building, like, a generic answer engine that just, like, synthesizes stuff from the Internet, Perplexity-style or Google AI Overviews-style, and then kind of gives that answer in a general way, like, everything to everybody, that's not a game I think a startup can win. Because Google, in their, like, AI Overviews announcement, even they themselves are like, yeah.

Speaker 1:

We, like, were able to lower the cost 80% over the last year. It's like all of the blockers to them doing this well, mainly cost, have just basically gone away. And as models have gotten better, it's very technically simple now to have a small, efficient, cheap, fast model that just synthesizes all this data and produces, like, a basic generic answer in an instant.

Speaker 2:

So you would predict that Google beats perplexity in this race?

Speaker 1:

Yeah. I don't think it's winnable by a startup.

Speaker 2:

Interesting. So what do you think is gonna happen to Perplexity?

Speaker 1:

I mean, I don't have a crystal ball. It depends on what they wanna be. My general prediction is that, like, because of this rat race that was caused by ChatGPT, LLMs themselves have been commoditized a lot faster than I would have predicted pre-ChatGPT. And what that's resulted in is, like, I don't think as a startup you can be everything to everybody. I think you have to choose who you want to serve.

Speaker 1:

I think that there's a lot of valid use cases in, like, say, serving researchers, serving people, like, looking for products, and, like, building beautiful, bespoke, highly optimized experiences just for those verticals. So I'm a huge believer that there will be, like, vertical search and, like, vertical AI chat applications that take a domain, do it extremely well, and then, like, they can be basically untouchable, because then, like, other companies have, like, innovator's dilemmas basically for competing with them. And also it's a matter of focus as a startup. Like, you really need to be laser focused on, like, one specific thing or you're just gonna kind of build, like, a mid generic thing for a lot of people. And so I think that, like, Perplexity has raised kind of enough money where they're able to sustain this burn, where they're offering these kind of slightly, you know, more advanced models than Google is currently offering in their AI Overviews product.

Speaker 1:

And they also enable follow-up chat, which Google AI overviews currently does not. But I just don't see how this is sustainable in the long run. The problem is Google is already entrenched on Chrome and on iOS, and they have all these deals and, like, Google's already the default everywhere. And this is a problem that they're clearly working on. And all of the previous innovator's dilemma that they had, like, it doesn't exist anymore.

Speaker 1:

Like, they're clearly trying to compete in the space. So all of it comes down to, like, will they execute? And of course, like, that is, like, the ultimate Perplexity question. But I think, like, at the end of the day, the unfortunate reality is that Perplexity can still be better and Google can still win just because of the platform advantage. The delta of the improvement has to be significant for people, like, the general population en masse, to be willing to go through, like, platform switching costs and switch over.

Speaker 1:

And that's the fundamental problem. Like, the single biggest reason why I started working on Phind, like, and did the YC leap and all of that, is because the delta of AI-generated answers versus no answers is massive. Massive delta. That's, like, a 10x improvement at least when done well. But, like, a slightly better LLM answer versus a slightly worse LLM answer? I don't know. Especially when the slightly worse LLM answer is literally everywhere, and it's already on all of the tools, like, you use.

Speaker 1:

I'm almost ashamed to admit that, like, I myself, like, will sometimes just open Safari and type something in and, like, boom. It's there. It's instant. Like, I don't even use my product sometimes. I don't even use say perplexity.

Speaker 1:

Like, sometimes it's just there. It's just the fastest. And so, like, long story short, I think it comes out to a matter of focus. Those who are focused will have a higher chance of succeeding. And for the record, I wish them the best.

Speaker 1:

Right? I think it's a very interesting problem. It's a problem we're solving. I wish them the best. And I hope that, you know, there's an angle here that will be interesting.

Speaker 2:

I'd love to get into the technical details now. So how does Phind work? What's going on under the hood?

Speaker 1:

Yeah. The fundamental architecture is very simple, and it's very similar to what it used to be back in the day where we have this retrieval step where we get web results that we think are relevant. And, of course, you know, there's some stuff that goes into that. So we do some intelligent query rewriting. So we have, like, a very fast and small LLM model that takes your query.

Speaker 1:

It reformulates it in a way that we think might be relevant, and, like, it's also optimized specifically for technical searches. So that's, you know, how we use it to help make sure that the best technical sources are in the search results. And then we also do some other precomputation on the fly. So we also decide, like, is one search sufficient or do we need to do multiple searches? So we have, like, this auto multi-search mode that basically runs off a classifier.

Speaker 1:

And so if it does, then we do multiple searches, and then we aggregate all that information. We feed all that information to an embeddings model that, you know, we've customized over the years, and this embeddings model takes in this technical information. It's tuned specifically to be able to perform well with code, and it's also tuned for high throughput and speed. So for every single request that comes in, we actually do up to, like, 8-way parallelization for the embeddings to make it significantly faster. So, like, our goal is, like, for the embedding step to be able to complete in, like, 100 milliseconds, basically, even with quite a bit of text coming through it.

Speaker 1:

Like, we sometimes pull a lot of sources, and we need to, like, organize it in, like, a 100 milliseconds or less. So that was a really fun technical challenge, like, figuring out how to, like, get that timing down. And then we form the context that we send to the model, and then we send it to the model. And we run both GPT models and Claude Sonnet now, as well as our own custom models that serve the vast majority of the traffic on the platform. And so our own custom models, that's been quite a journey developing those as well.
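
To make the shape of that pipeline concrete, here's a toy sketch of the embed-and-rank step: shard the fetched source chunks, embed them in up to 8 parallel workers, and keep the chunks most similar to the (rewritten) query for the prompt context. The sentence-transformers model stands in for Phind's custom embeddings model, and the chunking and shard count are assumptions, not their implementation:

```python
# Parallel embedding plus cosine-similarity ranking of retrieved chunks.
from concurrent.futures import ThreadPoolExecutor
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # stand-in embeddings model

def rank_chunks(query: str, chunks: list[str], top_k: int = 20, shards: int = 8) -> list[str]:
    if not chunks:
        return []
    # Split the chunks into up to `shards` parts and embed them concurrently.
    shard_size = len(chunks) // shards + 1
    parts = [chunks[i:i + shard_size] for i in range(0, len(chunks), shard_size)]
    with ThreadPoolExecutor(max_workers=shards) as pool:
        vecs = list(pool.map(lambda p: embedder.encode(p, normalize_embeddings=True), parts))
    doc_vecs = np.vstack(vecs)  # order matches the original chunk order

    # Rank by cosine similarity to the query and keep the best chunks.
    q_vec = embedder.encode([query], normalize_embeddings=True)[0]
    scores = doc_vecs @ q_vec
    return [chunks[i] for i in np.argsort(-scores)[:top_k]]
```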

Speaker 1:

You know, we started out with our own models, like, way back in the day just because, you know, OpenAI didn't have anything that would work. Yeah. Like, there were just no API models that could serve our needs in mid-2022. And after GPT-4 came out, everyone kind of lost their minds, including us. And we're like, okay.

Speaker 1:

We have to put this into the product. And for a while, like, the product was primarily GPT-4 based, particularly as, like, a technical product. And actually, us putting GPT-4 into the product resulted in, I think, like, one of the highest upvoted Hacker News posts of all time. Like, we got something like 1400 upvotes on a product launch, which is, like, very high. That was very exciting to see.

Speaker 1:

And for a while, we were GPT-4 only, but we had people always asking us, like, why is it so slow? Like, I want it to be faster. Like, I don't want to have to grab a cup of coffee every time I, like, ask a question. And also it was very expensive for us to run. And so running our own models was a way for us to figure out, like, hey.

Speaker 1:

How can we make a custom coding model? We don't have to worry about other use cases. So, like, that focus really kind of simplified this for us. How can we make just a coding model that will run fast and that people can use most of their time as the default? And that's why we ended up building the Phind models, which are based on Llama derivatives.

Speaker 1:

And so today, we have 2 different Phind models. We have a tune of the Llama 3 8-billion-parameter model, which is designed to just kind of serve the original vision of Phind, which is, like, how quickly can we just pull everything and, like, write kind of a Stack Overflow or documentation-style summary of web text. And then we have a larger 70-billion-parameter model whose goal is to be, like, the best model for most things, generally speaking, as a balance between speed and answer quality. And, of course, you know, now we're very excited about, you know, Llama 400B on the horizon. And so I think that's kind of, like, just a general summary of how everything works end to end.

Speaker 1:

But, yeah, I think there's also, like, a lot of interesting stuff happening in models that I think could be interesting to discuss as well.

Speaker 2:

Yeah. Let me just make sure I've understood it correctly, and then I would love to chat about the model part of it. But it's fundamentally a RAG pipeline. There's a search index. A search query comes in.

Speaker 2:

You're doing a search, retrieving that, embedding it, summarizing the context, and then providing that to the model. But there's a lot of nuance, it sounds like, in order to make it work well. So, actually, you're rewriting the question after it comes in to make it better for technical questions and to make the search better. You're doing, like, both dense and non-dense search. You're optimizing the speed of the embeddings so that you can do fast queries, and you have intermediate models that are figuring out how to do some of those things, and then you're making a choice about which model is finally used to generate the final answer.

Speaker 2:

One thing that I would love to hear a little bit more about is you said, you know, we retrieved this context, and then we put it in the context window, and then we generate you an answer. How much prompt engineering goes on? How much do you have to iterate over how you structure the context, or are you just fine tuning? Like, how do you make good answers come out of this system?

Speaker 1:

Funny enough, like, not that much prompt tuning is required. And particularly for our own models, like, they're not trained to be, like, diverse models. They're only trained on, like, a handful of prompts, mostly technical.

Speaker 2:

So they're fine tuned on input output pairs for your use case?

Speaker 1:

Right. Exactly.

Speaker 2:

So, a lot of work at the fine-tuning level rather than prompt engineering?

Speaker 1:

Yeah. We did a lot of work at the fine-tuning level to get these models to respond the way that we wanted. And we did, I think, some clever things at the fine-tuning stage as well. We don't really want to train on, like, other models' synthetic outputs because, like, they're frequently wrong. Code produced by models, like, it will have bugs in it.

Speaker 1:

And so our strategy for producing training data was to start with the code and use that as, like, the label, as, like, the gold standard that we're going to train the model to predict: the human-written code that we already know is right. And then generate a synthetic input to create, like, an input-output example that we can train on. So rather than having an input and then kind of generating synthetic data for the output, we instead start with the output that we know is correct, and then we generate a synthetic input to create the pair. And it turns out that generating synthetic data for the input is also a much simpler task than generating synthetic data for the output because, like, today's models are fantastic summarizers. They do summarization very well with, like, very little hallucination.

Speaker 1:

So giving it, like, a piece of code, and this takes some prompt tuning obviously, but, like, saying, like, hey, write, like, a sample input that has all the information necessary to recreate this piece of code. That actually works surprisingly well.
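
A minimal sketch of that output-first idea: treat human-written code you already trust as the training label, ask a strong model to write the question that would have produced it, and save the pair for fine-tuning. The prompt wording, the gpt-4o model choice, and the load_trusted_snippets() loader are assumptions for illustration:

```python
# Output-first synthetic data: the gold code is fixed; only the input is generated.
import json
from openai import OpenAI

client = OpenAI()

def make_pair(gold_code: str) -> dict:
    prompt = (
        "Write the question or instruction a developer would have asked that "
        "contains all the information needed to produce exactly this code. "
        "Return only the question.\n\n" + gold_code
    )
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return {"input": resp.choices[0].message.content.strip(), "output": gold_code}

with open("train.jsonl", "w") as f:
    for snippet in load_trusted_snippets():  # hypothetical loader of vetted code
        f.write(json.dumps(make_pair(snippet)) + "\n")
```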

Speaker 2:

So it feels like there's maybe a generalizable lesson there, which is that if you are trying to use models to create synthetic datasets, it's much easier to start from a dataset of answers and generate accurate questions than it is to start from a dataset of questions and generate the answer.

Speaker 1:

Absolutely. And, like, in our experiments, trying to get LLMs to generate high quality questions, that's very difficult. Doing kind of, like, this approach where you say, like, generate me, like, a 100 topics and then taking each of those 100 topics and, like, generate another 100 subtopics. Like, that just does not work super well in our experience.

Speaker 2:

What other generalizable lessons have you come across in the process of building Phind? So this strategy for generating synthetic data feels like one. Are there any other things that you've discovered or had to figure out along the way that you think anyone building AI products, like, could potentially use or that they should try and adopt?

Speaker 1:

So I think that, like, kind of at the product level, minimize the chances of the inherent non-determinism of the models screwing up a user experience that needs to be deterministic. I think that is kind of, like, the most generic way to describe the way that we think every day at

Speaker 2:

Phind. Wait. So what does that mean? I don't think I fully follow. So, like, where do I wanna try and, like, remove the nondeterminism?

Speaker 2:

What's an example?

Speaker 1:

So part of it is, I think, really using test-driven development in creating AI applications, like, at the engineering level. So for every single AI invocation that you have in your product, particularly when it, like, absolutely needs to be reliable within kind of a range of acceptable values: first of all, make the AI as self-contained as possible. Like, focus is very important for prompt tuning as well. So keep all of your different subtasks that you might deploy an AI on to as constrained a purpose as possible. Keep the message as unpolluted as you can. And then for each one of those kind of submodules, make sure that it works 90-plus, 99 percent of the time specifically for that thing.

Speaker 1:

And so something that we kind of had to learn the hard way was how do we, yeah, like, actually engineer a product built from the ground up where we have to make sure these modules actually work. And so we wrote these automated tests that need to pass, like, the tests run not once, but, like, 20 times on, like, a single kind of unit every time the tests run, and we calculate the percentage that it passes. And if it falls below kind of the minimum threshold, then we know it's bad. So I think that's kind of like a fundamental principle in software engineering that I think we kind of had to, like, expand on a little bit is, like, how do we actually make reliable components using LLMs? And that's kind of what I meant by the whole nondeterministic component.
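
A small sketch of that repeated-test idea for nondeterministic LLM components: run the same check many times and assert on the pass rate rather than a single run. The rewrite_query() component and the 90% threshold are placeholders; the pattern is what matters:

```python
# Pass-rate testing for a nondeterministic, LLM-backed component.
def pass_rate(check_fn, runs: int = 20) -> float:
    # check_fn invokes the component once and returns True/False.
    return sum(1 for _ in range(runs) if check_fn()) / runs

def test_query_rewriter_is_reliable():
    def check() -> bool:
        # Hypothetical component: the rewrite must keep the language name intact.
        rewritten = rewrite_query("how do i reverse a list in python")
        return "python" in rewritten.lower()

    assert pass_rate(check, runs=20) >= 0.90
```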

Speaker 2:

Okay. So you make heavy use of evaluation and testing for every subcomponent, enough such that they're reaching, like, some minimum performance threshold individually as well as joined together, to have confidence in what is otherwise essentially, like, a question of systems engineering: how do I make a reliable whole out of unreliable parts?

Speaker 1:

Exactly. And I think it also comes down to focus, just like how you need to be focused as a startup, as a team, etcetera. Like, the AI benefits tremendously from focus. And I think something that we've done is, like, we started, like, thinking as the AI. We were like, okay.

Speaker 1:

If some human came to me and they asked me what I'm asking of this model right now with all the information I'm giving it, could I, as a human, reliably produce the right answer? And sometimes, like, there's been a couple cases where we're like, okay. Like, we're just feeding the AI, like like, everything. But I, as a human, have no idea what's happening. Like, I have no idea what's going on.

Speaker 1:

Like, I couldn't answer it myself. At minimum, make sure that it's something that you can do as, like, a human because, like, if you can't, it's definitely not gonna be able to do it.

Speaker 2:

What's your workflow for improving Phind? How do you make it better over time?

Speaker 1:

Yeah. So we have, like, a data flywheel where, you know, data is coming in from what worked, what didn't work. We have all sorts of real-time feedback signals coming in from the website. We have, like, the thumbs up, thumbs down. We have, like, AI classifiers that we can also use to determine offline, after the fact, like, whether a given answer actually, like, answered the question appropriately or if it didn't.

Speaker 1:

And, like, a lot of, like, self reflection, I think, works as well. Like, we've built datasets where, like, we've had the model, like, reflect on, like, hey, what did it get wrong? How can we fix it? And use that to, like, continuously improve the models.
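
A rough sketch of the kind of offline "did this answer the question?" classifier mentioned here, using an LLM as a judge over logged question-answer pairs; the judge prompt and model choice are assumptions, not Phind's actual pipeline:

```python
# Offline LLM-as-judge over logged traffic, used alongside thumbs up/down
# signals to flag answers worth keeping (or excluding) as training data.
from openai import OpenAI

client = OpenAI()

def answered_correctly(question: str, answer: str) -> bool:
    judge_prompt = (
        "You are grading a technical Q&A system.\n"
        f"Question: {question}\n\nAnswer: {answer}\n\n"
        "Did the answer directly and correctly address the question? Reply YES or NO."
    )
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": judge_prompt}],
        temperature=0,
    )
    return resp.choices[0].message.content.strip().upper().startswith("YES")
```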

Speaker 2:

How do you use that to improve the model? So you gather this feedback data. How do you use it?

Speaker 1:

So we create new training data by having the models self-reflect on, like, things step by step, what they got wrong, and try to do, like, more, like, kinda complex tree-of-thought-style reasoning to try to correct their answers. And that's not perfect. Like, that's still a noisy training method because it's not always possible, but it does help. Like, giving the models more time to think offline and then reincorporating that back into the model, back into kind of real-time inference usage where they have less time to think, is still an effective strategy. Something that's also been really effective is just, like, at the product level, seeing people use it in person, inviting people over or, like, hosting in-person meetups and, like, actually, like, seeing how people use it.

Speaker 1:

Like, it's remarkable how differently people use it sometimes than, like, how you use it. Like, we have one of our top users who has, like, this trailing Google Doc of, like, all of the searches that he tried and, like, all of the different combinations of queries that he tried as well as their answers. And we're like, wow. That's very interesting. And, like, that gives us, like, ideas on, like, how can we better structure, you know, your thoughts and help you, like, manage your thoughts and explore that further.

Speaker 1:

And, like, the reason why that was so important is because that feeds directly into our hypothesis about how, like, very often, programmers don't even know themselves what they're trying to do. And so, like, that's very clearly a problem when, like, someone, like, has to have this Google Doc organizing their thoughts. It's like they're trying to use Phind to help them organize it, but clearly, like, you know, it's not enough. So I think that focusing on, like, product-level experiences like that is also very important. And then finally, realizing that, like, the model is not the product.

Speaker 1:

For a while, I think that was a very kind of blurry line for us to walk, where, like, the model was the most important part of the product. And I think that was certainly true in a, like, pre-commodification era. So, like, in the era of, like, there's only one model that is good at programming, like, or answering this specific type of, like, technical questions, then yes. Like, the model is in some ways really the star of the product. But now there are so many models, roughly all of approximately the same capability.

Speaker 1:

And OpenAI hasn't really fundamentally made their models better in over a year. Like, my hot take here is that, at least for programming, OpenAI models have not fundamentally gotten better since the first GPT-4 release. What they have gotten better at is following instructions. So GPT-4 Turbo and GPT-4o are a lot better at responding in exactly the kind of format that the user wants them to respond in.

Speaker 1:

But these models are actually more prone to, like, more fundamental logical errors or syntax errors in writing correct code than the original GPT-4. And so what's happened is there are so many models now, from OpenAI, from Claude, even from Meta as well, and, like, the models that we build fine-tuned from Llama. The model is essentially a commodity. And, like, you have to really, I think, think harder about, like, the product experience as a whole rather than saying, oh, okay. I just got, like, the best answer.

Speaker 1:

I'm done. Like, I think that, like, now that, like, the delta between models is so little, it's really more about, like, how can we help you as a developer get from idea to product end to end in, like, the least frustrating way possible.

Speaker 2:

What advice do you have for people building AI products today? Like, if I'm a product manager or an engineer and I'm building an LLM product for the first time, what lessons would you give, you know, someone or a friend who asks you, hey. I'm just starting out on this. What would be your tips?

Speaker 1:

I would say always, like, work from first principles. Because I think, like, whenever you're in, like, an AI hype cycle, like, it's very easy to be like, oh, this is so cool. Let me build it. And that's a very dangerous thing because, like, what is cool is sometimes not what you need to do to make the product functional for what it needs to do. And so, like, ideally, there's an overlap, but I think people working in AI need to stay really focused and really disciplined on, like, every single day, how am I, like, end to end, actually helping the user with their problem?

Speaker 1:

Because sometimes people, like, you know, ask me for, like, advice on, like, how they should integrate LLMs, and it turns out, like, they shouldn't use LLMs. They should use classical programming techniques. It'll be cheaper. It'll be faster. It'll be deterministic.

Speaker 1:

Like, I think that you still have to kind of, like, really reason from first principles about, like, is this really necessary for this very specific objective that I'm trying to achieve, and work backwards from there. And then, of course, you know, to reiterate, like, if you are working with LLMs, start by prototyping with GPT-4 or Claude. Like, don't start by fine-tuning your own models in 2024. You know, start with a great product like Humanloop that can help you figure out, you know, which prompts you should use and, like, help you prototype basically as quickly as possible to figure out, like, can I actually solve this problem using GPT-4? And then over time, you know, if your product really kind of takes off and you have a lot of volume, then you can kind of get into, like, scaling.

Speaker 1:

And, like, scaling is, like, a whole beast of itself where you figure out, okay, like, how can we, like, lower the unit economics of, like, running these models on a per-inference level? And at that point, fine-tuning really starts to make a lot of sense because it's possible to run, like, a fine-tuned model on your own hardware in a very high-throughput setting and save 80, 90%, like, over GPT-4, even GPT-4o. But, yeah, like, don't kind of get ahead of yourself with, like, fine-tuning models because it's cool. Always focus on, like, how can I actually solve this specific problem for my users, for my customers, and stay obsessed with that all the time.

Speaker 2:

In every technology wave, there's a distribution of value that accrues to incumbents, to startups, to different parts of the stack. I think people have written about, you know, how in mobile it probably went, like, more to incumbents than to startups. And maybe in the first Internet wave, it went more to startups than to incumbents. Obviously, you're running an AI startup. If you think about this, you're in a space where you do have competition that comes from the larger companies.

Speaker 2:

So there's GitHub Copilot or, you know, if we take the search engine example, like, Perplexity is going up against Google. What's your view on how the value will split for different applications and kind of where it'll accrue between startups and larger companies?

Speaker 1:

Great question. I think that, as kind of like a recap of the Perplexity discussion, in cases where the scope of the product is more broad, the value will accrue disproportionately to the incumbents because the incumbents already have the platform advantage. So, like, Google, for example, for, like, generic search, I think that is something where they have a significant advantage. However, for vertical-specific applications, that is where startups, I think, have a much better advantage because you're competing against the big companies more or less on even footing. Like, if it's a very vertical-specific application, chances are the big incumbent doesn't have, like, a version of that yet.

Speaker 1:

So it's like you, the startup, building this new specific thing from the ground up versus the incumbent. And that's where kind of, like, all of the old rules about startups versus incumbents come into play, where, like, the inherent bureaucracies and political infighting and, like, the lack of focus of the big companies really hurt the big companies compared to startups. So, like, what I expect to see happen is that, like, companies that are really focused on, like, a particular vertical, solving a very, like, specific need for specific customers, and who keep their eye on that ball, will accrue, like, disproportionate returns versus incumbents. But if it's more general, the incumbents are gonna win. And I think it's even more exacerbated by the fact that unique sources of data are, like, an insane advantage in the AI landscape, particularly when these models are commoditized.

Speaker 1:

So, like, you know, take, like, Harvey, for example, the legal AI assistant. Their lawyer, like, network, like, that collects their own bespoke data for their use case is not something, like, the big companies can collect as easily. It's just like they have that data. They have that focus. You know, kudos to them.

Speaker 1:

And, like, even say with, like, Cursor and VS Code, I think it's very impressive that, you know, Cursor has built this product that GitHub has still not been able to beat, despite it being, like, this VS Code-type tool and Microsoft owning VS Code. Like, I just don't know how they have not been able to kind of match them on that product in that niche. And I think, like, what that really shows is that, like, knowing your users, having focus, having that tight iteration cycle, it really helps.

Speaker 2:

If you picture the world 5 years from now, 2 part question. Broadly, what do you think will be different because of AI? And more narrowly, how will the work of a developer or someone building a software product be different?

Speaker 1:

So I think more broadly, I think, funny enough, we'll think about it less because it'll be more integrated than it is today. Because today, like, to use AI, you kinda have to think about it. You have to be like, oh, I have to open ChatGPT. Oh, I have to open, like, Phind. I have to open Claude.

Speaker 1:

I think that it's going to be integrated natively into many of today's existing platforms that already exist, that have already won. And, like, it's just going to work. You're not even gonna think about it as a user because it's just going to be a natural acceleration of kind of what you're already trying to do. And, of course, it'll also open new doors. It'll open new verticals.

Speaker 1:

But I really think that that is still further away. And I think that, like, I'm a little bit more pessimistic on it opening completely new verticals than I was before. Just because, like, the rate of progress in AI has both been faster and slower than what I expected. I know that sounds like a contradiction, but what I mean is that, like, everyone who's not OpenAI has moved very, very quickly, but the absolute state of the art has not budged that much in the last, like, 16 months or so.

Speaker 2:

I've discussed this with a few people on the podcast, and I've had kind of mixed opinions. Do you think that's because it's become fundamentally harder to push the frontier and, like, breakthroughs have just been, like, less forthcoming than we thought? Or is it just that, you know, large model training runs take a long time, like, the model's in the oven, it just hasn't come out yet? And so, like, we might have large step changes in capabilities. Right?

Speaker 2:

Like, it could feel for long periods of time like not much is happening because we're not actually seeing that progress, and then a new model gets released. Right? Between text-davinci-002 and GPT-3 coming out, the model which you said was effectively useless for answering questions, and GPT-4, from an outside observer, not much happened. Right? That's almost a year's gap, or more.

Speaker 2:

And if you weren't sitting inside OpenAI, it was easy to have no clue how much progress was being made. And then GPT-4 came out, and you said yourself that you guys had kind of a red alert moment and switched what model you were working on. Do you think that could be what's going on here, or why do you rule out that hypothesis?

Speaker 1:

I think it's very difficult to be on the outside and speculate about what's going on on the inside, as fun as it may be. But at the end of the day, the question comes down to how much capability is left to be unlocked in these models. I think that's the fundamental question. So, you know, you brought up text-davinci-002 to ChatGPT with GPT-3.5. The insight there was that these models were fundamentally capable.

Speaker 1:

They just needed, in retrospect, a relatively basic unlocking step to make that ingrained information useful.

Speaker 2:

So going from GPT-3 to 3.5, you know, from text-davinci-002 to ChatGPT, they figured out how to do instruction tuning and reinforcement learning from human feedback. And you're saying that was the real unlock, more so than scale or capability gains. And so if you buy the hypothesis that the main difference was the instruction tuning and not the capabilities of the base model, then you might buy the argument that we won't see another jump like that unless there's a change of that magnitude still to come.
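For readers who want a concrete picture of the instruction-tuning step discussed here, below is a minimal sketch of supervised instruction tuning in PyTorch with Hugging Face Transformers. The model name, the toy prompt/response pair, and the hyperparameters are illustrative assumptions rather than anything from the conversation, and real pipelines (plus the RLHF stage that follows) are far more involved.

```python
# Minimal sketch of supervised instruction tuning (the SFT step that precedes RLHF).
# Assumptions: a small causal LM ("gpt2") and one toy prompt/response pair stand in
# for a real instruction dataset; the learning rate is illustrative only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

prompt = "Instruction: Explain what a binary search does.\nResponse:"
response = " It repeatedly halves a sorted list to locate a target value."

prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
full_ids = tokenizer(prompt + response, return_tensors="pt").input_ids

# Train only on the response tokens: mask prompt positions with -100 so the
# cross-entropy loss ignores them and the model learns to follow the instruction.
labels = full_ids.clone()
labels[:, : prompt_ids.shape[1]] = -100

model.train()
outputs = model(input_ids=full_ids, labels=labels)
outputs.loss.backward()
optimizer.step()
optimizer.zero_grad()
print(f"instruction-tuning loss on one example: {outputs.loss.item():.3f}")
```

In practice this loop would run over many thousands of instruction/response pairs before any reinforcement learning from human feedback is applied on top.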

Speaker 1:

Right. I mean, it's really an open question. I don't know. So for example, GPT-4 basically scaled up this instruction-tuning approach to what was, at the time, the maximum. My hypothesis is that they chose that model size based on the largest model that would be runnable in a production environment and tolerable to users.

Speaker 1:

So, you know, something approximately 15 to 20 tokens a second. Fast forward to today, I think one of the most interesting breakthroughs we've seen recently is the interpretability paper from Anthropic, the one where they talk about Golden Gate Claude and how they were able to figure out what various features inside the model do and clamp them in a way that makes the model more reliable. So the breakthrough I'm most excited about, that I think we're going to see in the next generation of models, is training for correctness as opposed to training solely for next-token prediction. Training for process correctness, for generating correct code, for generating correct reasoning, for doing correct math, I think that will be very interesting. It'll be very helpful.
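As a rough illustration of the feature-clamping idea mentioned here, the sketch below pins the strength of one activation direction in a toy PyTorch model using a forward hook. The toy MLP, the feature direction, and the clamp value are all assumptions made for illustration; Anthropic's actual work identifies features with sparse autoencoders inside a production model, which is far more involved.

```python
# Illustrative sketch of clamping one "feature" direction in a model's activations,
# in the spirit of the Golden Gate Claude demo. The toy MLP, the feature vector,
# and the clamp strength are made-up stand-ins, not the real method.
import torch
import torch.nn as nn

torch.manual_seed(0)
hidden_dim = 16
model = nn.Sequential(nn.Linear(8, hidden_dim), nn.ReLU(), nn.Linear(hidden_dim, 4))

# Pretend this unit vector is a feature that an interpretability method discovered.
feature_direction = torch.randn(hidden_dim)
feature_direction /= feature_direction.norm()
clamp_value = 5.0  # force the feature to "fire" at a fixed strength

def clamp_feature(module, inputs, output):
    # Remove the activation's current component along the feature direction,
    # then add it back at the clamped strength.
    coeff = output @ feature_direction
    return output - coeff.unsqueeze(-1) * feature_direction + clamp_value * feature_direction

# Hook the hidden activation so every forward pass has the feature clamped.
handle = model[1].register_forward_hook(clamp_feature)

x = torch.randn(2, 8)
print(model(x))  # outputs computed with the clamped feature
handle.remove()
```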

Speaker 1:

What I'm less bullish about is whether that will really allow us to unlock a whole range of verticals that aren't possible today. I'm not sure. I think what's more likely is that it will enable products in verticals that already exist to be significantly more effective for their end users. And I'm concerned that we're going to hit a real diminishing-returns situation where we fully unlock transformers, and the rate at which we can make them bigger becomes hardware dependent, on how quickly we can figure out lower precisions. There's all sorts of optimistic stuff here.

Speaker 1:

Like the paper that showed that basically one-bit precision for models is possible without degrading them too significantly. But I don't know if that's gonna fundamentally take the model from being able to write a program when given specific instructions to inventing things with creativity, being this kind of creative tool as a service. I don't know. I don't know if we'll get there without another significant architectural leap.
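For context on the low-precision point above, here is a minimal sketch of binarizing the weights of a single linear layer, roughly in the spirit of the 1-bit results referenced. The layer sizes and the per-layer scaling scheme are illustrative assumptions, and real 1-bit models are trained with quantization-aware methods rather than a one-off post-hoc conversion like this.

```python
# Minimal sketch of 1-bit weight quantization for one linear layer.
# Each weight collapses to sign(w), and a single per-layer scale (the mean
# absolute weight) preserves overall magnitude. Layer sizes are arbitrary.
import torch
import torch.nn as nn

torch.manual_seed(0)
layer = nn.Linear(64, 32)

with torch.no_grad():
    w = layer.weight
    scale = w.abs().mean()       # one full-precision scalar per layer
    w_binary = torch.sign(w)     # weights become {-1, +1}
    layer.weight.copy_(w_binary * scale)

x = torch.randn(4, 64)
y = layer(x)  # forward pass with the binarized-and-rescaled weights
print(y.shape, f"scale={scale.item():.4f}")
```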

Speaker 2:

Yeah. I'm surprised by, or maybe I'm not surprised, but essentially, I think what you're saying is that overall, you don't think the impact of AI on society is gonna be that big. And that, roughly speaking, we've got most of what we're gonna get from this current wave, and it's just a question of it being distributed more evenly. Because if I ask you how the world looks 5 years from now, it sounds like you're saying more or less the same, but with things a bit more ubiquitous.

Speaker 1:

I don't think that's what I meant. I think there are gonna be radically new products and services in existing verticals that are dramatically successful and that help people, say, code 10 times better, a 100 times better, or do graphic design 10 times better, a 100 times better. And I think that will help people be more efficient. Yes. But I don't think, like, it'll

Speaker 2:

But it feels to me like maybe you're ignoring all of the second-order effects. Right? Basically, what you're saying seems inconsistent to me. It's hard for me to believe the two statements that you've made simultaneously. One of which is that engineers could be a 100 times more efficient, as an example, and not much else changes.

Speaker 2:

Like, the second-order effects of making software extraordinarily cheap, or the second-order effects of making lawyers effectively accessible to anyone and super cheap, will surely be large. Right? If those things are true, then I don't see how society couldn't look wildly different 5 years from now. And so if you say to me, as you already did earlier, that it'll look largely the same, but with AI products a bit more ubiquitous, like ChatGPT being baked into everything, that kind of world.

Speaker 2:

I view that as a world without significant change, really.

Speaker 1:

I don't, I don't, yeah. I never meant to make a statement about the magnitude of change or the second-order effects, which I'm sure will be massive. My biggest point there was that your average person, I think, will experience the benefits of AI without paying too much attention to the fact that it was AI. That's my point. My point is that AI will be seamlessly integrated into many facets of daily life, which for sure I think will have profound effects.

Speaker 1:

But I think, for the average person, it will just kind of happen, and it will feel normal. And so this is my point. You know, living in San Francisco here, I looked at my Waymo app the other day, and I've taken, like, 60 Waymo rides now. And the first couple of times, it was the craziest thing that I've ever experienced.

Speaker 1:

And now I just use it as a practical transportation mechanism, and it just feels normal. So my bigger point here is that, yeah, I definitely think society will be transformed in many ways. I think people will become vastly more productive. I think it'll be a lot easier to create businesses. I think there's gonna be a lot more creative output.

Speaker 1:

GDP will increase dramatically. That will all happen. But the craziest thing about all of that is that your average person will still think about their life, I think, the same way we do today, which is like, oh, you know, my life is cool. There are things that are going well and things that are not going well.

Speaker 1:

Yeah. I don't think it'll change most people's thinking on a day-to-day basis. And I think that's the very important point that I want to drive home, and that's almost a testament to its success rather than its failure. It's going to be integrated so seamlessly that it will feel magical at first, and then it'll all just feel totally normal.

Speaker 2:

Alright, Michael. Well, it's been a pleasure chatting with you, and I'm looking forward to doing it again in the future.

Speaker 1:

Thank you. Thank you for having me. This was great.

Speaker 2:

Alright. That's it for today's conversation on High Agency. I'm Raza Habib, and I hope you enjoyed our conversation. If you did enjoy the episode, please take a moment to rate and review us on your favorite podcast platform, like Spotify or Apple Podcasts, or wherever you listen, and subscribe. It really helps us reach more AI builders like you.

Speaker 2:

For extras, show notes, and more episodes of High Agency, check out humanloop.com/podcast. If today's conversation sparked any new ideas or insights, I'd really love to hear from you. Your feedback means a lot and helps us create the content that matters most to you. Email me at raza@humanloop.com or find me at razrazpro.