Chain of Thought | AI Agents, Infrastructure & Engineering | The AI Framework Era Is Over: Why Context Is the Moat

Jerry Liu built LlamaIndex into one of the most installed AI frameworks of the last three years, then bet the company that the framework era is over. He explains why context quality is the moat that survives as agent loops get good enough. Chain of Thought is hosted by Conor Bronsdon.

Show Notes

Jerry Liu built one of the most installed pieces of AI plumbing of the last three years. LlamaIndex became the indexing and retrieval layer a whole generation of RAG apps were stitched together with. Then he started arguing that the framework era he helped create is over.

Jerry is co-founder and CEO of LlamaIndex. In this conversation he walks through the company's pivot from open-source framework to managed document infrastructure with LlamaCloud and LlamaParse, and why he is betting that context quality is the one moat that compounds as agent loops get good enough to absorb the scaffolding.

If you are a founder worried a frontier lab or a coding agent is about to eat your product, this is the playbook for reinventing your ICP without losing the thread.

In this conversation:

Why Jerry says the AI framework era is over, and what actually survives
How agent harnesses like Claude Code collapsed the old framework patterns into the model
Why context quality is the durable moat, not the agent loop
How LlamaParse beats legacy OCR and frontier models on document accuracy and cost
Why 95%+ accuracy is the real bar for legal, insurance, and financial document work
How LlamaIndex disrupted its own product and reinvented its ICP to stay alive
Jerry's take on agent memory, model personalities, and why LLMs are still bad writers

(0:00) Is the AI framework era over?
(1:56) What died and what survived
(6:31) Why context quality is the moat
(8:12) Defining the context layer
(13:18) Coding and vision as the abstraction layer
(18:13) The bet that context compounds
(23:59) Which verticals are adopting
(25:14) Why 95%+ accuracy is the real bar
(29:49) The file system as an agent primitive
(34:33) Surviving your own pivot
(37:15) Reinventing strategy and hiring
(42:00) Agent memory as persistent context
(44:41) Model personalities and cultural memory
(47:51) Writing with AI
(50:19) Closing thoughts

Connect with Jerry Liu:

LinkedIn: https://www.linkedin.com/in/jerry-liu-64390071/
Twitter/X: https://x.com/jerryjliu0
LlamaIndex: https://www.llamaindex.ai
LlamaIndex careers: https://www.llamaindex.ai/careers

Connect with Chain of Thought host Conor Bronsdon:

Newsletter: https://newsletter.chainofthought.show/
Twitter/X: https://x.com/ConorBronsdon
LinkedIn: https://www.linkedin.com/in/conorbronsdon/
YouTube: https://www.youtube.com/@ConorBronsdon

More episodes: https://chainofthought.show

Creators and Guests

Host

Conor Bronsdon

Creator and Host of the Chain of Thought Podcast | Technical Ecosystem Lead at Modular

What is Chain of Thought | AI Agents, Infrastructure & Engineering?

AI is reshaping infrastructure, strategy, and entire industries. Host Conor Bronsdon talks to the engineers, founders, and researchers building breakthrough AI systems about what it actually takes to ship AI in production, where the opportunities lie, and how leaders should think about the strategic bets ahead.

Chain of Thought translates technical depth into actionable insights for builders and decision-makers. New episodes weekly.

Conor Bronsdon is an angel investor in AI and dev tools, Technical Ecosystem Lead at Modular, and previously led growth at AI startups Galileo and LinearB.

Disclaimer: All views, opinions and statements expressed on this account are solely my own and are made in my personal capacity. They do not reflect, and should not be construed as reflecting, the views, positions, or policies of my employer. This account is not affiliated with, authorized by, or endorsed by my employer in any way.

[0:18] Conor Bronsdon:
Is the AI framework era over? Let's talk about it. Welcome back to Chain of Thought, everyone. I am your host, Conor Bronsdon. My guest today argued publicly earlier this year that the AI framework era is over, that the scaffolding layer he helped build is collapsing into the model itself, and that context quality is the durable moat once agent loops get good enough. In fact, he runs a company that's betting on that thesis. I'm delighted to have Jerry Liu on the show today. Jerry is co-founder and CEO of LlamaIndex. He built one of the most installed pieces of AI plumbing of the last three years, the indexing and retrieval layer a generation of RAG apps were stitched together with, and then took LlamaIndex into a new era by pivoting the company into managed document infrastructure with LlamaCloud and LlamaParse. He's made a quite aggressive bet that the framework layer he helped popularize is the wrong place to play long term. You've probably seen him at a conference giving a fantastic talk. You've likely seen him on Twitter/X. Jerry, great to have you on Chain of Thought. Welcome. And where are you joining us from?

[1:26] Jerry Liu:
Yeah, thanks for having me, Conor. I'm joining from, you know, beautiful San Francisco.

[1:31] Conor Bronsdon:
It is always a pleasure to see you when I'm in the Bay Area. And I feel like I often just run into you at a conference too. You are so busy these days with the success of LlamaIndex. And one of the ways that you have made the company so successful is with this aggressive bet that what you built originally is no longer the approach. The AI framework era is over, as you put it. What specifically died and what survives?

[1:56] Jerry Liu:
Yeah, so there's probably a little bit of nuance to that statement too, and I can help clarify it for the audience. But just to trace through the evolution of the company, you know, we started off as kind of a core set of abstractions in open source land to help developers back in 2023 build these initial applications over large language models. And at the time, you know, none of the patterns were fully formed yet. People didn't really quite know what agents were. People were just starting to get the hang of this idea of retrieval augmented generation, or RAG, or feeding your private context into a model. And so I think basically when there's a space where things are still being figured out, and there's a lot of flexibility in the end application, that's where there's value in a framework layer, because the framework provides, you know, the core abstractions to make it easier to build certain things. And if you think about life, or just software in general, it's basically just a series of abstractions layered on top of each other. And so we were kind of just the open source abstractions at the base layer around the LLM that made it easier to connect your data, use different types of LLMs, and then also experiment with different techniques like, you know, retrieval, being able to do some sort of query rewriting, and then eventually some sort of agentic reasoning harness, so you can plug in tool calls and that type of stuff. And so I think basically, the reality is over the past three years, this space has obviously evolved quite a bit. The models themselves have gotten exponentially better. But what that also meant is, you know, the harnesses and the agentic applications themselves have also gotten a lot better. And the core architecture and patterns around certain types of agents have solidified.
And so it's not like, you know, frameworks in the abstract sense are no longer relevant universally. Obviously, there are still web frameworks, there are still a bunch of frameworks in general, and even today agents want to be able to use certain types of abstractions on top of others, so they don't have to rewrite everything from scratch. But the reality is a lot of the old patterns of, you know, really trying to figure out the different internals and mechanics of an agent are less relevant today, because a lot of those have solidified into these general agent harnesses. If you look at Claude Code, OpenClaw, you know, Manus, kind of any sort of deep research agent out there, they all start following general patterns where you have abstractions appear at a higher level than the code layer. So whereas at the code layer you might need to import Python classes that wrap different LLM providers, at a higher level now it becomes: how do you define a natural language set of skills, an MCP tool, how do you program this engine that's already running? And so there certainly is room for a set of abstractions at this layer too. And there are plenty of companies building stuff in this space, whether it's OpenCode or these protocols, to help support this new age of agent programming. I think for us personally as a company, we did start off at the framework layer, but fundamentally I think our mission was always connecting data with LLMs. And so I think throughout the years, you know, we could have continued to operate at the framework layer as it went up in abstraction. But for us, we decided to really identify and solve an opportunity where we realized that even as the AI agents were getting better, the context layer itself was not solved.
A lot of folks were uploading unstructured document-based data into these systems, and a lot of that data was just not represented the right way. And a lot of the legacy OCR technology, you know, was not actually doing a good job at reading information out of these pages. And so we basically narrowed our focus from being a framework to becoming core infrastructure that helps provide, you know, high quality context to these evolving AI agents, which is, yeah, a little bit different than being at the framework layer. But we do think that, especially in our position right now, it is something that should be durable even as these agents improve. Now, of course, there's still plenty of space for other types of companies around the stack. A lot of folks are building various types of context that plug in through MCP or skills. Systems of record, that's still a thing. As I mentioned, there are still higher level abstractions and frameworks. It's just that those are a little bit different than what we started with.

[6:31] Conor Bronsdon:
Yeah, I love this idea of context quality as a moat, because increasingly it has become a watchword for folks over the last six months in particular, I feel like, where the push to have great context has just exploded as coding agents have gotten better and better. We've known data is a differentiator for a long time, but it is just becoming so clear that to get the most out of, you know, the next frontier model, to be able to fuel it, you need to have great documents and great data for it. And I look at the way we are approaching enterprise AI today and how much data is still locked away in silos, both from a tribal knowledge perspective, but very much so from a document perspective. And it seems incredibly clear that this pivot, or focusing, as you put it, has a ton of value for enterprise in particular, as they say, okay, how can I get the massive corpus of documents and data that we've established over the maybe decades that we've been in existence into extremely useful, high-quality content that can be fed to our LLMs of choice, whether that's, you know, an open model or a frontier closed model. So I wonder, from your perspective, have you seen a behavior change that either drove your decision to make this pivot for LlamaIndex or is now driving this next phase of the company?

[8:12] Jerry Liu:
Yeah, it's a good question. I think the context need has kind of always been present, even since we started the framework. Part of the reason the framework itself got popular is because, you know, you had all these large language models, and they had a 4,000-token context window. I think the first thing a lot of builders tried to do was, you know, how do I just get this to understand my company knowledge base? And obviously, the company knowledge base was not going to fit within the 4,000-token context window, whether it was your Notion database, you know, your set of Jira tickets, your Salesforce, that type of stuff. It just wasn't going to fit. And so how do you figure out patterns to inject LLMs with the right set of information at any given time? That gave rise to these techniques of RAG in 2023. And so I think even from that point onwards, there is, almost by definition, a need to provide context to an LLM in order to make it actually do stuff. And I think as the AI space has evolved, the LLM part evolved into some general agentic harness. You know, there's Claude Cowork, Claude Code, OpenClaw, Manus, some of these general agent implementations that you can basically use from a third party provider. And then, you know, there are all the things you can plug into the agent harness: MCP tools, skills, the actual task that you want to define. By definition, this is the context layer, right? And the frontier labs are building these incredible generalized reasoning capabilities that need access to external services, tools, and context to actually do things. And so when I think of the context layer, it's literally anything surrounding the model, like the set of services that it has access to in order to do things.
And this could be anything. It could be an existing piece of software that the agent is able to use. If it has MCP connectors to Confluence or Salesforce, you know, those are providers of context. That's why I think people believe this idea of a system of record is still going to be around, because agents need a way to actually store, act, and operate on data. And then, you know, there's web context, being able to actually crawl the internet and efficiently search for things so that the agent has access to live information. There's structured data, you know, stuff locked up within SQL databases. And for us, it's document context, stuff that's locked up within, you know, these unstructured document containers within your file system, in the form of PDFs or PowerPoints that need advanced technology like new age OCR to actually read information off the page. And so I think for any builder, or basically any software company today, one of the moats is actually just being a provider of tools and context to the AI agent. Or maybe to put it another way, I think a lot of software companies are trying to figure out how to make their software and services more agent native and also based on consumption.
And so, you know, instead of humans using your SaaS tool, or humans reading your documents, or humans doing searches in Google, you now have agents actually going in and, you know, calling a ton of MCP tools and writing scripts and doing these calls. And a lot of the world is basically redesigning these patterns to make them suitable, so that agents can launch massive volumes of queries through any piece of software to both get information and also do things. And so I think that's kind of how it's evolved over the years. But even from the beginning, again, almost by definition, you need this idea of specifying some sort of task plus additional data for AI to do things. And I think the functionality of that has only expanded over the years as AI agents have gotten more advanced.

[12:02] Conor Bronsdon:
Yeah, the fact that even Mercury for banking has a CLI now, I think, really says a lot about what we're trying to provide. That's not to say that developer experience has gone away, but agent experience is as important, if not, I would argue, much more important, especially not just now, but in the months ahead. We need to be providing all this information to agents to make them special. And part of this is because, you know, native function calling has gotten so good, instruction following has gotten much better, agent loops are at least good enough, and the models are just so much smarter than they were two years ago. I think you used to be able to see this major gulf between open-source and frontier models, particularly before, you know, DeepSeek really shot onto the scene, but I would look at Kimi and others, where open models are not far behind. And now I should caveat, I haven't gotten access to Mythos yet. So maybe there is this massive leap coming that I'm not seeing. But right now it feels like a lot of the differentiation that you're seeing between companies is how effectively they are able to give an LLM data and context and tools, and provide it the right governance to just go and run, essentially.

[13:18] Jerry Liu:
Yeah, yeah, exactly. I think basically all the generalized frontier labs are converging on similar ideas, with some slight implementation differences. And the idea is, you know, they're all settling on coding as the first layer of abstraction that they're getting really good at helping to automate. And in the process, because coding is basically a proxy for computer use, you can take some of the techniques and capabilities there and use them to automate basically any type of knowledge work, not just software engineering. That is why CLI interfaces today are so popular: these coding agents are really good at writing bash and writing code, and just enabling them to do that lets them operate basically any interface. So I think the frontier models, honestly, are all going after kind of the same type of things. You know, maybe one day it's like Opus is better at financial knowledge work, maybe GPT 5.5 is a little bit better at certain types of coding. They're all kind of going after it. And I think open weight models are catching up. I mean, I think it's kind of impressive to see that across a diversity of parameter sizes, models are basically almost simulating what, you know, Opus 4 was a few months ago or half a year ago with a smaller parameter size. So the advancements are great. I think for everybody else in the space, it just means: think about any task that you might have difficulty solving now in a generalized fashion. Basically, instead of overengineering your stack, if you just wait six months, a more general agent harness will probably get better and be able to solve the task, assuming you actually specified the parameters of the task right.

[15:01] Conor Bronsdon:
Yeah, I totally agree. And I think your point about frontier labs in particular solidifying around coding as this first abstraction layer is so apt. It's just a great way to communicate with computers, which enables so much, particularly as it becomes so much cheaper to automate and we can now do differentiated automations, not just linear ones. But I would look back to a conversation I had with the Poolside co-founders on the podcast a little over a year ago now, something like that, where they talked about this idea that coding is the way to AGI. And, you know, it's a bit of a hype phrase, but I think we're really seeing that come to fruition, where it's like, oh yeah, everything is converging around being excellent at coding and then being able to automate tasks because of that. And we're just seeing that push forward.

[15:51] Jerry Liu:
Yeah, I think it's definitely a proxy. I think it's because of the way your computer is designed: if you code and program around it, it basically gets you most of the way to what you want anyway, because there's the command line, you can write scripts, you can execute tasks, and every service has an API that you can just call. And, you know, with coding, you get a lot of the way there. I think the next part is probably just vision capabilities. With advanced vision capabilities, you can start looking at things not just at the code or text level but, you know, actually take an entire computer screen and operate it at a visual level, so you know what to click, know what you're looking at, and can zoom in on certain things. I think that's probably the next step that every frontier lab is going for: combining coding, which is an efficient way of, you know, writing automation and operating over interfaces, with vision, and even operating over video streams, so they can more intelligently take actions.

[16:48] Conor Bronsdon:
Yeah, it's interesting to see this all converging as well with robotics and the approaches there of, OK, how can we now get these to take more intelligent actions in the real world? It's going to be fascinating to see how this all develops the next few months.

[17:03] Jerry Liu: [OVERLAP]
There's a

[17:03] Conor Bronsdon: [OVERLAP]
But in particular, as you mentioned, the way to make this effective, whether that's trying to have a million DoorDashers wear a camera so that your robots can get better at delivering food, whether that's getting as much excellent code, or bad code that I wrote back in the day, into your model for training so it can be great at coding, or whether it's trying to get your particular set of agents to understand your business through the document context and other context that you've established over the years, it all comes back to, you know, training data and context. And I think this is a bit of a simplification, but I would argue that we can look at context as just a crucial type of data. It's almost silly for us sometimes not to conflate the two terms. Obviously there are reasons not to, but anytime you say data, you could also think of data as just context for a model in the right situation. And you've made this bet that context quality is what's going to compound over the next couple of years. Can you talk to me about why you're so convinced this is going to be durable instead of just another commoditized step?

[18:13] Jerry Liu:
Yeah, for sure. Well, you know, I can talk a little bit about documents in particular, but let me start with the overall space. I think in general, if you think about the context layer, as I mentioned, the overall agent harness is kind of a blank slate. The frontier labs will give you, you know, access to Claude Cowork, access to Claude Code, but there's no starting thing. You basically have to put in the instructions to actually get it to do things. And I think once you start operating with that kind of mindset, then, you know, there are certain types of tasks where you have to give it a lot of instructions, by definition, for it to actually do things well. Because not only can you not specify that task in a single sentence of natural language, you now have to give it a lot of context. For instance, if I'm trying to, you know, prepare for a customer call that's upcoming on the calendar, it's not like I can just say, I need you to prepare for my customer call. I need to actually feed it the, you know, previous 5 to 10 transcripts on the customer, and actually look into the email threads to see what the recent conversation history has been. And so there are just tasks that by definition need a lookup of external sources of data. And, you know, you need ways of actually providing that data for the task to be more well specified so you can actually solve it. And I think the way I think about vertical AI, which we're not doing, but, you know, a lot of these vertical AI companies are, is that it's basically packaging context in a way that makes it a little bit easier to solve certain tasks, instead of relying on users to do it themselves.
You know, I think Claude Cowork is such a blank slate, and I'm pretty sure if you equipped it with the right tools and MCPs and skills, you could basically use it to do anything that is possible within the current, you know, realm of possibility of AI agents. But certain things will require you to put in, you know, a lot of work. You would need to write and really tune some giant skills file plus MCP tools for it to solve some sort of knowledge task, and, you know, the reality is most people either don't have time, or they don't know how to, or some combination of both. Especially if you look at all these AI solutions in certain verticals, whether it's financial research or healthcare. I mean, who knows, you know, Anthropic's kind of building some of these tools too. But basically, by providing a tailored interface to solve a task, it just makes it way easier to actually do the thing with AI, versus relying on you to program this blank slate to do everything. And I think that's where a lot of vertical AI provides value. I almost think of vertical AI as a form of context to help solve more domain-specific tasks. Now, going a little bit into what we do, because context itself, I think, is super general. It's just literally any set of inputs you provide to the model, including inputs plus task plus external data. We're specifically interested in document context, and in turning a PDF into Markdown or some extracted form of information that's easy and accurate for the agents to understand. I think the way that we get compared is, first, against legacy OCR tools. There's like 25 plus years of document OCR tools whose entire job was to convert PDFs to Markdown or some sort of text.
And the other alternative is you feed an image, like a screenshot, directly into the frontier model, and then basically just use the frontier model as an OCR tool. So I think that's kind of the basis of comparison here. And I think for us, we see a lot of opportunity in building something that's just extremely high accuracy at low cost, if we focus on both very fine-tuned models plus agentic harness techniques, only on document understanding. I think the reality is that all the frontier models are getting better at general visual understanding. You look at Gemini, Opus, GPT: with every new release, they're starting to climb the benchmarks on general visual understanding, but they're also tuned for coding, reasoning, and a lot of other tasks. And especially if you're trying to unlock context or just parse a million PDFs, you really need something that's both extremely high accuracy and also really cheap. And I think the way the frontier models are tuned, they're just not incentivized to actually do that, because they're tuned for high intelligence tasks. And so the opportunity that we see is providing really deep tech to both parse and extract documents with extremely, extremely high accuracy and low cost, plus being able to provide relevant metadata annotations and tags on these documents that then give these AI agents access to almost like audit trails back to the source document. If, when an AI agent generates a response, you can trace back to the exact words in a legal contract that it came from and draw a box around them, that's a very powerful tool in a lot of vertical AI stacks, whether it's a big company or a small company. And so we think of ourselves as building just a general toolkit plus all the surrounding metadata.
That's also really high accuracy, for AI agents to access any pool of documents. And for us, hopefully that should be enough of a moat that it's differentiated from users trying to do it with frontier models themselves.

[23:59] Conor Bronsdon:
Are you seeing particular verticals adopt your stack more aggressively?

[24:06] Jerry Liu:
Yeah, for sure. You know, I think for us, the typical verticals that we go after reflect industries that have a lot of their context locked up in documents. And I totally acknowledge there are a lot of companies that, honestly, don't have a lot of documents. If your entire, you know, stack is basically dependent on code, for instance, we don't typically index code bases. That's typically the domain of coding agents. But within legal, within financial services, within insurance, within manufacturing, healthcare, education, government, and then basically the tech versions of all these, there are a lot of use cases where there are either humans, or just a lot of inherent knowledge bases, that contain, you know, millions if not more of unstructured document-based data. And this is kind of the basis of a lot of projects. If you go into these companies, they're really trying to automate a lot of workflows that humans typically used to do, using AI to actually understand all the information within these docs, extract out the right information, and also take actions.

[25:14] Conor Bronsdon:
And I imagine accuracy becomes a major differentiator, especially as you're working with regulated industries that are touching customer funds.

[25:23] Jerry Liu:
Yeah, that's actually super important. And it's the reason why I think it's also hard to just DIY your own stack with a frontier model. If you look at Opus 4.7, GPT 5.5, you know, they're not bad at understanding medium complexity documents, but it's kind of deceptive: it'll get you 80% accuracy over your document corpus, while there'll be like 20% of documents where it'll just hallucinate some number or, you know, represent a table wrong. The issue in a lot of these use cases is that this puts it below the accuracy bar for what you can actually operate with, because with a 20% hallucination rate, if you actually try to extract out information or answer questions over that data, the downstream AI agent workflow is going to be disrupted. And any sort of straight-through automation with agents becomes a little harder to trust, given the gap in accuracy. And so a lot of these require really high accuracy, 95% plus, especially in more sensitive use cases, with human-in-the-loop review on some of the extracted data.

[26:35] Conor Bronsdon:
So are you seeing a tension between agentic pipelines, where the model decides what to extract, and deterministic pipelines that have more auditability? How is that functioning with these regulated enterprises?

[26:46] Jerry Liu:
Yeah, I think maybe just to reframe the question a little bit, I think there are kind of two use cases for using AI to automatically extract information from documents, which is basically what we do as a company. Our mission, our tagline, was that you can use our stuff to unlock context from documents and feed it to AI agents. One practical instantiation of that is you have a bunch of data in the form of documents, maybe in your Box, Dropbox, or Microsoft SharePoint, and you want to create a knowledge base from those documents and put it into a vector database. That basically means that you use our capabilities to do that first-pass OCR on top of the documents and convert them into Markdown, and then the downstream operations become chunking, embedding, and putting it into some storage system so you can actually search over it. So that's one of the use cases, where you're structuring context in a way such that you can use it for any downstream agent, whether that agent is a simple, low-sophistication agent or something that's super advanced, like Claude Code. And I think that's a very common use case. Basically, any enterprise wants to create some agent-ready knowledge base that can give an agent the ability to do things. Now, the second use case is related but a little bit different, in that there are a lot of human workflows that existed prior to AI agents, where humans are basically just reviewing massive piles of paperwork and then doing manual data entry into systems, whether it's onboarding, claims, invoices, KYC, that type of stuff. There's a lot of existing software where it's really just a helper interface for humans to upload a bunch of documents, help automatically put flags on some values, and then enable them to more easily do data entry and correct wrong values.
And I think the advantage of a lot of this technology, especially with what we have in comparison to legacy OCR tools, is that it just massively increases the potential to actually automate the extraction of all these documents with really high accuracy. It's a little bit less that story of, I'm going to structure all these documents into a knowledge base and give it to an AI agent. It is a little bit more deterministic too, because there are usually business rules and flags you want to apply, and then systems you want it to route to. But it's still a super important use case, because a lot of these companies have a lot of folks just doing this task, and it's relatively repetitive. And I think if they use slightly better technology to actually understand these documents and translate them into a digital representation, they'd both be saving a lot of money and be able to process a lot more volume. That's actually pretty common in a lot of these industries too, and that's where we're seeing some usage as well.
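
The first use case above (parse to Markdown, then chunk, embed, store, and search) can be sketched roughly as follows. This is a toy illustration under stated assumptions, not LlamaIndex's or LlamaParse's actual API: `parsed_markdown` stands in for whatever a document parser would return, `embed` is a bag-of-words stand-in for a real embedding model, and `InMemoryIndex` is a hypothetical stand-in for a vector database.

```python
import math
import re
from collections import Counter


def chunk_markdown(markdown: str) -> list[str]:
    """Split Markdown into chunks, starting a new chunk at each heading line."""
    chunks, current = [], []
    for line in markdown.splitlines():
        if line.startswith("#") and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return [c for c in chunks if c]


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class InMemoryIndex:
    """Hypothetical stand-in for a vector database."""

    def __init__(self) -> None:
        self._rows: list[tuple[Counter, str]] = []

    def add(self, chunk: str) -> None:
        self._rows.append((embed(chunk), chunk))

    def search(self, query: str, top_k: int = 1) -> list[str]:
        q = embed(query)
        ranked = sorted(self._rows, key=lambda row: cosine(q, row[0]), reverse=True)
        return [chunk for _, chunk in ranked[:top_k]]


# Assumed parser output; a real pipeline would get this from an OCR/parsing step.
parsed_markdown = """# Claim summary
Policy holder reported water damage on the ground floor.

# Payment terms
Invoice total is 1200 USD, due within 30 days.
"""

index = InMemoryIndex()
for chunk in chunk_markdown(parsed_markdown):
    index.add(chunk)

# An agent (or a person) can now retrieve the most relevant chunk for a question.
print(index.search("when is the invoice due")[0])
```

The point of the sketch is the shape of the pipeline rather than any one component: each stage can be swapped for a production-grade version without changing what the downstream agent sees.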

[29:49] Conor Bronsdon:
It's interesting, because from what you're saying, if coding agents are basically the abstraction layer that models are centralizing around, and you are enabling this other primitive of the file system to be unlocked, what does this tell you about the future of how agents are going to evolve, based off of this clear abstraction layer and crucial primitive?

[30:22] Jerry Liu:
So I'm not actually convinced the file system plus coding agents is going to be the future, so to speak, a year from now, a year and a half, two years from now. Honestly, I have no idea. So I'm going to leave that to whoever wants to drop a prediction. I do think that's the current state of the art, because coding agents are a proxy for computer use, and vision capabilities aren't quite there yet to help these agents massively stream real-time video. So this is one of the closer representations of just being able to automate a bunch of stuff on your computer. You combine that with the fact that this abstraction of the file system is ready for coding agents to use, and these agent harnesses are also very good at using file systems to navigate existing documents, retrieve the right context, and then take actions over them. And so I think it's a very powerful abstraction right now that actually solves a good chunk of knowledge work. And I think where we come in is, again, we basically would be one of the core modules, if there are document-based types of data, to help unlock context from those containers into the file system. Going into the future, though, honestly I have no idea. The thing is, there's probably going to be some new abstraction that's not just coding. All the vision stuff that I mentioned, there's probably going to be something there that allows them to more generally see the screen and do actions. There might be new types of operating systems that actually make it a little bit friendlier for agents to do things on top of. Yeah, but I think for me, reasoning from first principles, even if the harnesses or even the core primitives change, going from file systems, CLI, and coding to something a little bit more vision-based, this concept of context, I think, still matters quite a bit.
And I think for us, I feel reasonably confident in our position of being a durable layer, even as these agent patterns evolve.

[32:27] Conor Bronsdon:
Yeah, I was wondering about this when you brought up the idea of, oh, the reason why we're converging on code today is because that's the easiest way to just communicate with computers, essentially.

[32:39] Jerry Liu: [OVERLAP]
Yep.

[32:39] Conor Bronsdon: [OVERLAP]
And I was kind of going like, okay, well, we're already seeing changes in our hardware layer. You know, it's not just GPUs and CPUs anymore. You know, we've got TPUs, we've got XPUs, we're expanding. Are we going to see a total rethinking of what a computer is? I mean, we've already seen people talking about it. There's this whole idea of Neuralink, etc. Everyone's trying things with vision. I don't know, but it seems like we are at the onset of a new hardware wave too, though it may take a little longer, that is going to match agents maybe more specifically.

[33:17] Jerry Liu:
Yeah, it's possible. I think there might be a new hardware wave, and you would know more than I do on this in terms of the underlying infra and hardware that powers some of these models, especially as the architectures evolve and change. I'm not actually sure that there will be an entirely new software layer for AI, though, because of the way it works right now. Part of the reason it's so powerful is that, instead of trying to reinvent every part of the software stack, the frontier labs just made it really easy to operate on top of existing tools everyone's already using. So they're technically not reinventing Windows or Mac or Photoshop or even a lot of these SaaS tools. The idea of MCP, the idea of skills, the idea of actually just being able to write code but import Python libraries, is all operating on top of abstractions that already exist. And so I do think it's more likely that they're going to try to build something that runs on Mac or Windows, even in a generalized computer-use way. And then later on, there might be a new agent-native operating system. And I'm sure Microsoft or someone will come out with that, if they haven't already tried. But basically, I think the overall AI trend is to adapt to existing tools and go from there.

[34:33] Conor Bronsdon: [OVERLAP]
Yeah, this brings to mind the framing of this whole episode, which is, hey, LlamaIndex went from an open-source framework to this big pivot. How did you do it successfully? Because I think there are a lot of folks listening, or reading Twitter, who are going, I'm a little worried my startup's going to get killed by Claude Code. I'm a

[34:54] Jerry Liu: [OVERLAP]
Mm.

[34:55] Conor Bronsdon: [OVERLAP]
little worried that a frontier lab is going to demolish whatever I'm doing because they simply start to do it better. They are able to automate it away. How have you made LlamaIndex so durable?

[35:06] Jerry Liu: [OVERLAP]
It's a good question. I mean, I wouldn't say we're in the clear yet, I'm going to be totally honest. I mean, we're

[35:10] Conor Bronsdon: [OVERLAP]
Fair.

[35:11] Jerry Liu: [OVERLAP]
not like,

[35:11] Conor Bronsdon:
Yeah.

[35:11] Jerry Liu:
again, we're not Microsoft. I do think at this stage, literally anything could happen. You look at companies that are at much greater scale than we are, and I'm pretty sure they're also having constant existential crises every time there's some new model release coming out. I do think that what we have done has put us in a bit more of a durable position compared to where we were three years ago. And to be honest, for any potential founders, it's not an easy process. You are fundamentally thinking about how to disrupt your own product, your own product strategy, and then also maybe move the audience of who you're serving. There's going to be a core group that remains, but you're moving to some other group that is maybe a little bit more adjacent to what you were originally serving. And so with all these questions, it's not an easy task. But one thing that's interesting is you listen to all these podcasts of folks with companies that are much bigger than where we are today, and they're all wondering the same things. I think even these very seasoned business leaders at public SaaS companies are trying to figure out how to disrupt their business or how to completely reinvent their product. And so I guess the only thing I'll say is, you're not the only one. Companies that are a thousand times bigger than you are also trying to do the same thing, and that's harder than you trying to do it yourself. In general, I think, especially in this AI landscape, everybody has to be extremely on their toes and just be willing to reinvent their ICP. And I know that has some potential issues for continuity. 
But at the same time, if you are able to do it, even for some period of time, it helps you establish your foothold and serve some emerging need in the AI market. And the benefit is, because everybody's so interested in AI, when something does land, it typically grows pretty quickly.

[37:15] Conor Bronsdon:
How did you have to alter your own, I guess, internal company strategy as you made this pivot? Did you have to change your hiring processes or have you changed them in other ways due to what's going on?

[37:28] Jerry Liu:
Oh, yeah. I mean, through this entire journey, luckily, I think we hired some core folks that are really invested. And this overall North Star that we had, I actually don't think it's changed too much, even if the product surface and the ICP itself have changed. The North Star was always figuring out how to get data into LLMs. I think that basically was how the company started. And to some extent, that's pretty much still what we are doing, trying to figure out how to get data into LLMs, just maybe operating at a slightly different layer. I think people understood that mission. And I think part of the goal was to understand, okay, what are the best tools to actually serve that mission? In the beginning it was a framework, because people just didn't have anything really settled. So the best way to build those tools was something that just made it easier for folks to build various types of apps and connect their data in various ways. I think as those patterns emerged, it became clear one of

[38:28] Jerry Liu:
the valuable pieces you could provide is basically just really deep tech around a certain piece of context. Because the abstractions were solidifying, but the need for high-quality production context was still evolving. Then it's just communicating that to the team and accepting that there's going to be friction. And yes, hiring will obviously change depending on what you actually build. We need core applied AI researchers (if you're interested, let me know) to actually work on the frontier of document understanding capabilities and do deep tech ML, not just AI engineering, actual ML, right? Model tuning, training, that type of stuff. And so I think when you're a startup, you just have to adapt.

[39:11] Conor Bronsdon:
Jerry, now's when I have to ask you to drop your email or best way for folks to get in touch if they are listening and they do want to apply. Where should they go?

[39:19] Jerry Liu:
Yeah, for sure. There is llamaindex.ai slash careers. If you're interested, we're looking for a variety of roles. By the time this comes out, I'm sure the roles may have changed too, because there are always expanding needs across go-to-market and also eng. I think the other piece is, we're pretty active on socials, so we're pretty active on LinkedIn. You can follow us at LlamaIndex there and also on Twitter, and I have a Twitter account too.

[39:46] Conor Bronsdon:
Some folks may have seen you on Twitter, that is for sure. Jerry is fairly popular there, to say the least. And I will co-sign here and say I know a lot of incredible folks at LlamaIndex, so it seems like a cool place to work, though I haven't actually done it myself. So definitely check that out if there are any roles that are of interest to you. I want to bring it back to something you said before we derailed on hiring a little. It's really poignant, but I'm also finding it mildly concerning as I think more about it, which is this idea that the whole job of LlamaIndex is getting data into LLMs. And it's making me think about what a high percentage of my own job today is just that problem. So I guess it really brings to mind this point that you made earlier: look, this may not be durable forever, but what we're doing today is really valuable, because so much of the value creation happening at companies around the world today is that exact problem of how do we effectively get data into LLMs? How do we make sure it's accurate? How do we enable these models to help us run faster? If we all think that velocity and throughput are rapidly increasing through LLMs, then we need to do the same to keep up, whether it's in our own jobs or for our companies. I mean, that's a very valuable place to be.

[41:07] Jerry Liu:
Yeah, yeah, for sure. I mean, there's this existential question of just, how much are we just an LLM in general? But I do think as humans, we basically are responsible for at least defining the task specifications. It's similar to how you would communicate with another human, from a junior to eventually a senior employee in their function, and give them enough context to get on board and get started. And that includes all the data that might be present that you don't explicitly communicate, but that they have to go and research themselves. So they would go in, look at your internal company knowledge base, look at your transcripts, look at your calendar, email, and be able to synthesize stuff. And that's basically what's happening with LLMs today. You give a task specification, they look at your context, they go do things, and this only gets better as the models get better.

[42:00] Conor Bronsdon:
We've talked a lot about context today, but I'm curious to get your take on agent memory, which is something that folks are starting to increasingly talk about. We had a great episode with Rich Mandelake on it that I totally recommend folks check out. What's your take on this and how memory and context should be interacting?

[42:17] Jerry Liu:
Yeah, I think memory is basically just a form of, I guess, persistent context. I'm still kind of forming my opinions about this. I think there are different types of memory. On one hand, there's this idea of a system of record, entries that you update in a database, just because you need a table of things, a store of things. Just think about if you make a new entry in a CRM or a new entry in Notion, right? That is a form of memory, because you're updating your context within some system of record, with the ability to store and retrieve from it. And I think there are a lot of ideas around how agents update their state that basically revolve around that. It could be updating an actual database, updating a SaaS tool. And with Karpathy's tweet, it was around being able to build and update your own internal wiki, which is basically a system of record on your file system. I think that's a generally applicable practice that literally anyone using a coding agent can do today: build some internal knowledge base that's just a set of files you can look up later. Oftentimes I find, especially for non-technical people or people that are a little bit too lazy to code, like myself these days sometimes, the simpler the abstraction, the better. And so this concept of having memory just be a pool of files on your file system is kind of nice to think about, because it's pretty easy to represent and store. I think there are certainly more complicated forms of memory. There are ways of being able to actually synthesize some sort of graph from your memory, form entity relationships, synthesize summaries. I personally have not spent a ton of time building those systems. I'm sure there are folks that have a lot more experience there. 
And I do think, maybe for some of these frontier labs, when they bake in memory as one of the features, they do a lot of research to understand some of the more advanced primitives that actually enable users to store various forms of state, so that they don't have to retype various types of context for the next task. But for general users, what I'm interested in a lot of times is just these dead simple abstractions that literally anyone can just take today. And

[44:34] Jerry Liu:
people can still use that as an initial instantiation of memory for basically any task that they do.
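
As a rough illustration of memory as "a pool of files on your file system", here is a minimal sketch. The `FileMemory` class, its method names, and the one-Markdown-file-per-topic layout are all assumptions made for this example, not something described in the conversation or taken from any particular tool.

```python
import tempfile
from pathlib import Path


class FileMemory:
    """Hypothetical memory store: a directory of Markdown notes, one file per topic."""

    def __init__(self, root: Path) -> None:
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, topic: str) -> Path:
        # Turn the topic into a simple, filesystem-safe file name.
        slug = "".join(c if c.isalnum() else "-" for c in topic.lower()).strip("-")
        if not slug:
            raise ValueError("topic must contain at least one letter or digit")
        return self.root / f"{slug}.md"

    def remember(self, topic: str, note: str) -> None:
        """Append a note to the topic's file, creating the file if needed."""
        with self._path(topic).open("a", encoding="utf-8") as f:
            f.write(f"- {note}\n")

    def recall(self, keyword: str) -> list[str]:
        """Return every stored note containing the keyword (case-insensitive)."""
        hits = []
        for path in sorted(self.root.glob("*.md")):
            for line in path.read_text(encoding="utf-8").splitlines():
                if keyword.lower() in line.lower():
                    hits.append(f"{path.stem}: {line.removeprefix('- ')}")
        return hits


# Usage: a temporary directory stands in for a real project folder.
with tempfile.TemporaryDirectory() as tmp:
    memory = FileMemory(Path(tmp) / "memory")
    memory.remember("Acme renewal", "Contract renews in March; legal wants a 30-day notice clause.")
    memory.remember("Writing voice", "Prefer short sentences and no filler.")
    print(memory.recall("notice"))
```

Because the notes are plain files, a coding agent can read and update them with the same file tools it already uses, which is the appeal of keeping the abstraction this simple.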

[44:41] Conor Bronsdon:
Yeah, I had a conversation, recorded this morning actually, with Tyler Akidau, who's the CTO of Redpanda. I think it'll come out the week before this episode, so if it didn't, I apologize to everyone who's watching this and doesn't have that context. But he talked about a paper that they're working on with a psychologist to basically identify trends in agents and think through what we are missing as far as how agents think and operate in the world. And one of the things that came up, and that we talked briefly about, was this idea of almost like cultural memory, the training memory, that we're starting to see emerge across different frontier model families. And I think the example that is obvious to folks who are maybe too much on Twitter, like myself, is, oh, OpenAI really loves goblins, evidently. Like, okay, GPT, sure. Or you can look at some of the tics that Claude has picked up along the way, and how it is seen by many as treating things a little differently than the OpenAI family of models. And I think we're starting to see this kind of emerging model lineage. You brought that up earlier in this conversation, this idea that the models are converging on what they can do, but there are these different flavors almost of what's happening there. And I'm curious if you have a perspective on what is driving this. Is it related to the training data and context being provided? Is it more about the alignment approach that the model teams are taking? Do you expect to see these differentiated, diverging cultures occur? What do you expect to happen on the model front?

[46:25] Jerry Liu:
I think it's a fascinating topic. I will say, I probably don't have that much experience to actually speak to it, so I mostly speak to it as an observer. When I think about memory, especially because we're the builders on top of these models, it's typically the stuff that is individual and user-based, to help store state, to help solve your task, and less the general memory. I think a lot of the pre-trained memory is super interesting, though, because obviously there are personality differences between 5.5 and Opus, and obviously they all have these different tics. They write a little bit differently. I think

[46:56] Conor Bronsdon:
Yeah.

[46:56] Jerry Liu:
that's actually been one of my top complaints: a lot of these models, at least in my humble opinion, are absolutely trash at writing the way I want them to write. It might be a skill issue on my end. But basically, I think these personality characteristics actually do flow into the writing too. I think a lot of it's a result of post-training. And it's kind of interesting, there are all these RL environments that are being sold to all these frontier labs these days. I almost wonder if you could take the data distribution of all the different environments and sims that every frontier lab is running, and then correlate that to the personality type

[47:39] Conor Bronsdon: [OVERLAP]
Huh.

[47:39] Jerry Liu: [OVERLAP]
of like what it actually ends up. I would be super curious. I just don't have access to any of this data, so I can't actually speak to it intelligently.

[47:46] Conor Bronsdon: [OVERLAP]
If someone wants to send Jerry and I that data, that would be fascinating. I think we would love to check that out.

[47:51] Jerry Liu:
Yeah. Like, let's say this one was 50% tuned on financial knowledge work, whereas this other frontier model is 30% tuned, but 20% more tuned on, you know, patients or doctor,

[48:01] Conor Bronsdon: [OVERLAP]
Yeah.

[48:01] Jerry Liu: [OVERLAP]
like, you know, doctors talking. Yeah. I think it'd be super interesting.

[48:05] Conor Bronsdon: [OVERLAP]
That makes a lot of sense to me. I will bring up the writing skills front briefly too, as we're wrapping up here. What I've found successful, and highly recommend, is basically using three classes of skills to help LLMs write on my behalf. One is writing voice context, so building out a context document of how I like to write, and

[48:28] Jerry Liu: [OVERLAP]
Yeah.

[48:28] Conor Bronsdon: [OVERLAP]
building a skill on top of that, and then having a bunch of examples I can reference, particularly for different types of writing. Two, I built a skill that I open-sourced called Avoid AI Writing. It's done pretty well. It's got like 1,400 stars or something on it now. But basically it says, okay, here are all the tics we see from models. Please just get rid of those. Think about writing them differently. Do these things. You might find that useful. And then the third one is an editor on top of that, a second-pass editor, where it's like, okay, I wrote this in your voice, then I went through it with the clean AI writing skill and said, let's get rid of all the em dashes, let's cut down on these repetitive steps. Because models, while decent writers in some cases, will often repeat themselves, like, oh, this thing works, I'm just going to use it 15 times in this essay. And it's like, good Lord. And then

[49:17] Jerry Liu: [OVERLAP]
Yeah,

[49:17] Conor Bronsdon: [OVERLAP]
like have that second pass editor

[49:18] Jerry Liu: [OVERLAP]
yeah,

[49:18] Conor Bronsdon: [OVERLAP]
and help it out.

[49:19] Jerry Liu:
it's funny. I think I tried to create my own writing skill. It roughly captures some of the stuff that you're talking about, just maybe with a little bit less sophistication. And it's interesting, we're trying to hack our way around these models not being great writers themselves. I wonder if they'll get post-trained to be better.

[49:35] Conor Bronsdon:
Yeah, I think it will, but the problem I see is that we're seeing convergence in how people write in some areas, and I feel like anyone who really cares about the craft of writing wants to have a bit of a differentiated voice too. So I do think it's crucial to have your own form of post-training through context and how you provision the model and instruct it as well.

[49:56] Jerry Liu:
For sure. Yeah, totally agree.

[49:58] Conor Bronsdon:
Well, thank you for following me on this brief distraction around writing with AI, because I think it's a fascinating topic that I'll probably talk about more on this podcast at some point. Maybe I'll write an essay on it. But Jerry, it's been fantastic catching up with you. I appreciate you taking the time to come on the show, and I'm super excited to get this out. Do you have any closing thoughts you want to share with the audience?

[50:19] Jerry Liu: [OVERLAP]
Yeah, no, thanks so much for having me on the podcast. It was great to talk about the general macro evolution of AI itself and the concept of the context layer. I will say, I'm not an expert who's 100% accurate at predicting what's going to happen in the next year or two. But I do think, from first principles, AI is something that is going to fundamentally change and get better. So it's helpful to think about exponentials in that

[50:49] Conor Bronsdon: [OVERLAP]
Hmm.

[50:50] Jerry Liu: [OVERLAP]
sense. But it's also helpful to think about what typically stays constant, what are by definition the things you need to enable the AI even as it gets exponentially better. That's basically a thought exercise that we did, where we thought about the context layer in general. And for us, it meant narrowing down a little bit to focus on document-based data. I think for a lot of builders in verticals or other industries, it's helpful to think about what things these labs are likely going to pursue, and then what things they're probably just not going to have time for, or can't do because of the interface. Those are areas that you could go in and dig into. My thoughts are just one perspective. There are plenty of successful AI companies out there. And so I hope at least some of this was interesting.

[51:41] Conor Bronsdon:
I will definitely say it was interesting from my end, so I hope our listeners loved it as well. And I know they can find a lot more information from Jerry and LlamaIndex at llamaindex.ai, at Jerry's Twitter account, and on LinkedIn. Anywhere else, Jerry, they should go check out besides that careers page you mentioned earlier, which they definitely should?

[51:58] Jerry Liu:
Yeah, no, for sure. If you're generally interested, if you do have a lot of docs and you want to talk to us, we do have a self-serve product called LlamaParse. And then we also have a contact form on the site. But in general, if you have some of these use cases, you're welcome to DM me on Twitter.

[52:13] Conor Bronsdon:
Amazing. Jerry, thank you so much for coming on the show. It's been great catching up with you. I'm going to have to try to write an essay for our Substack at newsletter.chainofthought.show based off of this discussion about writing and some of the great insights you provided. So super excited to get that out. And thanks so much for coming on.

[52:30] Jerry Liu:
Yep. Thanks so much for having me.