How AI Is Built | Knowledge Graphs for Better RAG, Virtual Entities, Hybrid Data Models

Kirk Marple, CEO and founder of Graphlit, discusses the evolution of his company from a data cataloging tool to a platform designed for ETL (Extract, Transform, Load) and knowledge retrieval for Large Language Models (LLMs). Graphlit empowers users to build custom applications on top of its API that go beyond naive RAG.
Key Points:

Knowledge Graphs: Graphlit utilizes knowledge graphs as a filtering layer on top of keyword metadata and vector search, aiding in information retrieval.
Storage for KGs: A single piece of content in their data model resides across multiple systems: a document store with JSON, a graph node, and a search index. This hybrid approach creates a virtual entity with representations in different databases.
Entity Extraction: Azure Cognitive Services and other models are employed to extract entities from text for improved understanding.
Metadata-first approach: Graphlit extracts comprehensive metadata from each source and canonicalizes it so that it can be filtered on. This improves indexing and retrieval, which is crucial for effective RAG.
Challenges: Entity resolution and deduplication remain significant challenges in knowledge graph development.
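
To make the key points above more concrete, here is a minimal sketch of a "virtual entity" with a representation in each of three stores, plus a graph filter applied before vector ranking. It is not Graphlit's implementation: plain in-memory dictionaries stand in for the document store, graph database, and search index, and the names `VirtualEntity`, `HybridStore`, `canonical`, and `cosine` are hypothetical. The lowercase-and-trim canonicalization is only a placeholder for real entity resolution, which the episode calls out as the hard part.

```python
import math
from dataclasses import dataclass


@dataclass
class VirtualEntity:
    """One logical piece of content, represented in three stores (hypothetical)."""
    content_id: str
    document: dict    # JSON-style record for the document store
    entities: set     # entity names extracted from the content
    embedding: list   # vector kept in the search index


def canonical(name):
    """Naive canonicalization: lowercase and collapse whitespace."""
    return " ".join(name.lower().split())


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class HybridStore:
    """In-memory stand-ins for a document store, a graph, and a vector index."""

    def __init__(self):
        self.documents = {}  # content_id -> JSON document
        self.graph = {}      # canonical entity name -> set of content_ids
        self.vectors = {}    # content_id -> embedding

    def ingest(self, item):
        # Write the same content into all three representations.
        self.documents[item.content_id] = item.document
        self.vectors[item.content_id] = item.embedding
        for name in item.entities:
            self.graph.setdefault(canonical(name), set()).add(item.content_id)

    def search(self, query_vec, entity=None, top_k=3):
        # Start from everything in the vector index.
        candidates = set(self.vectors)
        # Graph as a filtering layer: keep only content linked to the entity.
        if entity is not None:
            candidates &= self.graph.get(canonical(entity), set())
        # Rank the remaining candidates by vector similarity.
        ranked = sorted(
            candidates,
            key=lambda cid: cosine(query_vec, self.vectors[cid]),
            reverse=True,
        )
        return [self.documents[cid] for cid in ranked[:top_k]]


store = HybridStore()
store.ingest(VirtualEntity("c1", {"title": "Intro to Graphlit"},
                           {"Graphlit", "Kirk Marple"}, [1.0, 0.0]))
store.ingest(VirtualEntity("c2", {"title": "Airflow pipelines"},
                           {"Airflow"}, [0.9, 0.1]))
store.ingest(VirtualEntity("c3", {"title": "Graphlit connectors"},
                           {"graphlit "}, [0.0, 1.0]))

# Without the graph filter, the near-duplicate vector for c2 ranks second.
unfiltered = store.search([1.0, 0.0])
# With the filter, only content linked to the "Graphlit" node is ranked.
filtered = store.search([1.0, 0.0], entity="GraphLit")
```

In this toy setup the graph lookup narrows the candidate set before any similarity scoring happens, which is one way to read "you have to find constraints to make it workable". A production system would replace the dictionaries with real stores and the string normalization with a proper entity resolution step.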

Notable Quotes:

"Knowledge graphs is a filtering [mechanism]...but then I think also the kind of spidering and pulling extra content in is the other place this comes into play."
"Knowledge graphs to me are kind of like index per se...you're providing a new type of index on top of that."
"[For RAG]...you have to find constraints to make it workable."
"Entity resolution, deduping, I think is probably the number one thing."
"I've essentially built a connector infrastructure that would be like a FiveTran or something that Airflow would have..."
"One of the reasons is because we're a platform as a service, the burstability of it is really important. We can spin up to a hundred instances without any problem, and we don't have to think about it."
"Once cost and performance become a no-brainer, we're going to start seeing LLMs be more of a compute tool. I think that would be a game-changer for how applications are built in the future."

Kirk Marple:

LinkedIn
X (Twitter)
Graphlit
Graphlit Docs

Nicolay Gerold:

LinkedIn
X (Twitter)

Chapters
00:00 Graphlit’s Hybrid Approach
02:23 Use Cases and Transition to Graphlit
04:19 Knowledge Graphs as a Filtering Mechanism
13:23 Using Gremlin for Querying the Graph
32:36 XML in Prompts for Better Segmentation
35:04 The Future of LLMs and Graphlit
36:25 Getting Started with Graphlit

Graphlit, knowledge graphs, AI, document store, graph database, search index, co-pilot, entity extraction, Azure Cognitive Services, XML, event-driven architecture, serverless architecture, graph RAG, developer portal

What is How AI Is Built?

How AI Is Built dives into the different building blocks necessary to develop AI applications: how they work, how you can get started, and how you can master them. Build on the breakthroughs of others. Follow along as Nicolay learns from the best data engineers, ML engineers, solution architects, and tech founders.