How AI Is Built

Modern search systems face a complex balancing act between performance, relevancy, and cost, requiring careful architectural decisions at each layer.
While vector search generates buzz, hybrid approaches combining traditional text search with vector capabilities yield better results.
The architecture typically splits into three core components:
  1. ingestion/indexing (requiring decisions between batch vs streaming)
  2. query processing (balancing understanding vs performance)
  3. analytics/feedback loops for continuous improvement.
Critical but often overlooked aspects include query understanding depth, systematic relevancy testing (avoid anecdote-driven development), and data governance as search systems naturally evolve into organizational data hubs.
Performance optimization requires careful tradeoffs between index-time vs query-time computation, with even 1-2% improvements being significant in mature systems.
Success requires testing against production data (staging environments prove unreliable), implementing proper evaluation infrastructure (golden query sets, A/B testing, interleaving), and avoiding the local maxima trap where improving one query set unknowingly damages others.
The end goal is finding an acceptable balance between corpus size, latency requirements, and cost constraints while maintaining system manageability and relevance quality.
"It's quite easy to end up in local maxima, whereby you improve a query for one set and then you end up destroying it for another set."
"A good marker of a sophisticated system is one where you actually see it's getting worse... you might be discovering a maxima."
"There's no free lunch in all of this. Often it's a case that, to service billions of documents on a vector search, less than 10 millis, you can do those kinds of things. They're just incredibly expensive. It's really about trying to manage all of the overall system to find what is an acceptable balance."
Guests: Stuart Cam and Russ Cam (Search Pioneers)
Host: Nicolay Gerold
00:00 Introduction to Search Systems
00:13 Challenges in Search: Relevancy vs Latency
00:27 Insights from Industry Experts
01:00 Evolution of Search Technologies
03:16 Storage and Compute in Search Systems
06:22 Common Mistakes in Building Search Systems
09:10 Evaluating and Improving Search Systems
19:27 Architectural Components of Search Systems
29:17 Understanding Search Query Expectations
29:39 Balancing Speed, Cost, and Corpus Size
32:03 Trade-offs in Search System Design
32:53 Indexing vs Querying: Key Considerations
35:28 Re-ranking and Personalization Challenges
38:11 Evaluating Search System Performance
44:51 Overrated vs Underrated Search Techniques
48:31 Final Thoughts and Contact Information

What is How AI Is Built?

How AI is Built dives into the different building blocks necessary to develop AI applications: how they work, how you can get started, and how you can master them. Build on the breakthroughs of others. Follow along, as Nicolay learns from the best data engineers, ML engineers, solution architects, and tech founders.

Nicolay Gerold: Search is a system problem.

We need ingestion and indexing, query understanding, retrieval and re-ranking.

You have databases and AI, nowadays also LLMs, and everything has trade-offs: relevancy vs. latency, accuracy vs. speed, query time vs. indexing time.

Today we are taking a system-level perspective on search, and we are talking to Stuart and Russ Cam.

They have both led search infrastructure teams, first at Elastic, then at Canva, and now at their own shop.

You get to peek behind the curtains of some of the biggest search platforms and learn what you can use in your own.

They have seen what breaks, what scales, and most importantly, what actually works.

From query understanding to evaluation systems, we will look at the practical solutions to the problems.

Stuart Cam: It's quite interesting to see. Elasticsearch, being Lucene-based, obviously has full text search at its core, right?

It was flying that flag for many years, and it was only later that vector search was added.

And it's quite interesting to see the
vector databases come in much later.

And they're now adding all
of the BM25 for text search.

I think there's an acknowledgement
that maybe there isn't one

approach that solves all.

That hybrid approach is
often the best way to go.

And it's interesting to see that they're both fighting, or working, towards the same end output.

It's just that the approach was different, so that's been quite interesting to see.

You can run the vector database
as like a sidecar process

to the main process, right?

You are delegating out all of your
vector search to that sidecar.

There are downsides to that
obviously, now you're out of process.

You've got something that isn't part of
the core process that you have to manage.

If things go wrong in that space, it can fail in weird and wonderful ways that you might not be fully aware of if everything was running in the JVM.

You have a disconnect
there, which is interesting.

You obviously pay the marshalling costs across processes as well. There's pluses and negatives for all things, right?

I guess one of the pluses is that you can access essentially unlimited memory on the server, which with the JVM is a bit more tricky to do.

Russ Cam: I was going to say,
there's some historical reasons

for that too, with open search.

So you have... I think FAISS was the first one that was originally supported, and then support for NMSLIB came in.

And then eventually, when OpenSearch 2 moved to Lucene 9, it had the Lucene vector implementation available as well, so you've got three different choices, I think largely for historical reasons.

Different kinds of pros and cons
to them as Stuart alluded to there.

Yeah, it's not super clear which one to use in which scenario from reading the documentation, but some of them have limitations on things like filtering, for example, that other ones don't.

Nicolay Gerold: Yeah.

Do you think we will see, in regular search databases, the same push we see in the other database types, where you have open table formats on top of S3 or GCP, on top of buckets, in which we can dump more and more onto the disk itself and then have indexes for text search on disk as well?

Russ Cam: Yeah, that split of
storage and compute, I think is

something that, lots of vendors
are starting to do more and more.

In the sense that you can split out the
indexing process, dump to some cheap

storage, like S3 and then spin up querying
processes on demand to go and, load

that into memory into efficient data
structures, and then go and query on that.

I think Elastic and OpenSearch have offerings that do that.

Stuart Cam: Elasticsearch obviously has searchable snapshots, which is essentially what you're suggesting there, where if you have data that's in your cold or frozen tier, you can just dump that out to S3 and have that searchable.

There are downsides to that, of course.

When you get into the idea of
mutability, that's an issue.

Often these things are better if they're
immutable, because they're read only.

It allows for much more efficient storage.

As soon as you make anything
mutable, it's a bit of an issue.

The other thing is
obviously latency as well.

Often it's the case that, if something is on a disk-based system and not necessarily in memory, there's a seek time and a random access time that can be a couple of orders of magnitude slower than just a memory pointer read.

So there's no free lunch there.

You might be able to store more data,
but often it's slower to access and

often you pay the price in the fact
that you can't actually change it.

It's suitable for some
use cases, not for others.

Logs and metrics is probably fine.

Application search, not so much.

Russ Cam: Yeah, I think that's a good point there: if you're dealing with large volumes of data, that kind of split of compute and storage is something you typically want to do.

For logging systems you have a massive long tail of older information that you maybe want to query at some point in time, but most of the useful information is the last day, or seven days.

And so you want it around to be able to query it, but it's not super important to have it loaded into RAM, super fast accessible.

Whereas I think for a lot of search systems, the size of data that you're dealing with is typically one that can fit in a reasonably sized distributed system, and having that in RAM and accessible is really what you want in that situation to keep latencies down.

Nicolay Gerold: Yeah, I think that's one
issue also in AI that we have frequently

that you have multiple different systems
in the end, but you have to orchestrate

where you have the data for training,
which is in some form of offline storage.

And then you have the feature store,
which is more the online storage.

What would you say is the biggest mistake companies make when they build their search systems?

Stuart Cam: Good question.

I'll kick off on one thing.

I think there's often an overemphasis on technology, and not necessarily the user experience.

Sometimes it's the wrong technology
as well, trying to shoehorn in

a database system that perhaps
isn't the best for search.

The first thing that comes to mind
immediately is something like SQL.

Trying to use SQL for search.

It's okay.

And then at some point it just
falls over as you realize it

isn't really designed for that.

Think of doing edge n-grams or any kind of text analysis processing. SQL doesn't have those kinds of features available.

So often we see, the wrong technology.

And then, yeah, focusing on the
technology and not necessarily

the user experience as well.

Perhaps, Russ, you have
some other insights as well?

Russ Cam: Yeah, I was just going to say on that point about relational databases: lots of them obviously have an extension piece that allows you to do some form of full text search.

And that often is like a good stepping
stone, I think, into building that

kind of capability in a system,
particularly if you don't have

any search system already there.

I think what I've typically seen in the
past though, is you do start to rub up

against lack of capabilities and lack
of features to be able to actually

implement the things that you want to.

So typically, for example, one of
the things that you usually end up

running up against is the ability
to actually configure and control

relevancy effectively and the
ranking algorithm used for that.

That's something that, many years ago, trying to do with SQL Server's text search capabilities was a real hard challenge.

But I think they're a good stepping stone to utilizing a specific search engine for the job.

But yeah, overemphasis on
technology is probably a good one.

I think one of the other things that we
often see is a lack of ongoing evaluation.

It's quite easy if you follow
the guidelines of, many of the

documentation for getting started with
Elasticsearch or OpenSearch or Vespa.

You can go spin up a search
service very quickly.

Start using that, start getting
some, results out of there.

You show it to your bosses and
show it to the rest of the team.

And it's Hey, we have some
search system up and running.

Awesome.

Then how do you go from that to understanding whether you're improving, making search better or making relevance worse? What do the feedback loop and the measurements look like for that system?

I think often you see that companies
get to that first state, first step,

which is I've built a search system.

It's okay, now how do you evaluate that?

How do you understand whether you're
making improvements to it, whether

you're making things better or worse?

Oftentimes I see people
stop at that point.

Or they look at trying to do optimizations in the small. So looking at specific queries that might get reported by end users or by the boss, to say, hey, I went looking for some socks and I couldn't find the specific socks I was looking for.

So then people tend to focus on the
small individual issues that come up.

And that then can affect,
the wider relevancy problem.

Stuart Cam: A lack of systems around relevancy.

And being driven by anecdotes. It's quite easy to end up in local maxima, whereby you improve a query for one set and then you end up destroying it for another set.

So you end up, fixing one set of
queries to improve the results in

one space and having them actually
detrimentally affect it in another.

And if you don't have a systematized approach to measuring your improvements to relevancy, often it's the case that you'll release something, you'll actually break it somewhere else, and not realize that until you get a user reporting the issue.

And we see that quite often: not having decent feedback loops from users, but then also not having the necessary system in place to release those improvements or perceived improvements to relevancy.

So yeah, that's poor handling of edge cases, variability, that kind of thing.

Nicolay Gerold: How do you approach
that when you really have a system in

production already, you have a small
test set, do you basically add more

and more examples for the different
edge cases you encounter to the test

sets over time until you say I have
a broad coverage of the different

query types and the results I want?

Stuart Cam: yeah, so there's
a number of ways to do that.

So a golden query set is a good place to start. If you think about what you would like to do with a golden query set, you're trying to establish a sample of queries that users run on your system that encapsulates a proportion of the head queries.

So those queries are the most popular.

So that might be, if you were on a retail site, an iPhone, an iPod, etc.

And then you would want to have a
representation of your middle queries.

Things that aren't necessarily at the head
or tail, but it's a broad set of queries.

And then you'd want a sample
of tail queries as well.

So you want a broad sample of everything.

And you would essentially then build up
a golden query set, which is for those

queries, what are the human evaluated
set of results that best fit that query.

That would essentially then be a human judgment list.

So you'd have your operators go in and say, if I was searching for X, Y, Z, I would expect to see A, B, C and these products. That allows you essentially then to establish some kind of baseline: if you change an algorithm in your search system, or if you change your search system, are you still seeing, for that golden set, those results in the same position? Or if you're not, what is the deviation?

And there are all sorts of mathematical functions out there, for example NDCG or reciprocal rank, that allow you to, shall we say, put a mathematical figure on the change delta between the two algorithms.

And that essentially then buys
you some level of confidence.

And it gives you a line
of inquiry as well.

If you change your algorithm that
affects head queries positively, but

tail queries negatively, you might
say actually, we're okay with that,

we're happy to improve it for the head
queries, but not necessarily for the tail

queries, then you might establish a new
baseline and measure from that instead.

But at least it gives you a line of inquiry.

Now, we would see that essentially running in an automation suite.

So when you make a change your new
algorithm is run against that automation

suite, and then you get the results
back, and then you can establish whether

or not you think it's a viable change.
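As a rough illustration of what such an automation suite can look like, here is a minimal Python sketch: the golden set, the judgments, and the search function are hypothetical stand-ins, not the tooling discussed in the episode.

```python
import math

# Hypothetical golden query set: human-judged relevance grades (0-3) per document,
# sampled across head, mid, and tail queries.
GOLDEN_SET = {
    "iphone":          {"doc_1": 3, "doc_2": 2, "doc_9": 1},  # head query
    "usb-c cable":     {"doc_4": 3, "doc_7": 2},              # mid query
    "left-handed mug": {"doc_8": 3},                          # tail query
}

def dcg(grades):
    """Discounted cumulative gain of relevance grades in ranked order."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(grades))

def ndcg_at_k(ranked_doc_ids, judgments, k=10):
    """Compare the delivered ranking against the ideal ordering of the judgments."""
    gains = [judgments.get(doc_id, 0) for doc_id in ranked_doc_ids[:k]]
    ideal = sorted(judgments.values(), reverse=True)[:k]
    return dcg(gains) / dcg(ideal) if ideal else 0.0

def evaluate(run_search, k=10):
    """Run every golden query through a search function and average NDCG@k."""
    per_query = {q: ndcg_at_k(run_search(q), judged, k) for q, judged in GOLDEN_SET.items()}
    return sum(per_query.values()) / len(per_query), per_query

# run_search would call your real search system and return ranked doc ids, e.g.:
# mean_ndcg, per_query = evaluate(lambda q: my_search_client.search(q))
```

Comparing the mean and the per-query scores before and after an algorithm change gives you the delta and flags which query buckets got worse.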

And then once you believe it's a viable
change, then you have a number of

avenues available to you in terms of,
okay we think this is an improvement.

That opens a question of how do
you establish that, how do you

establish that it's an improvement?

So what are you going to measure against?

Now, you could, for example, if you're
an e commerce store, you could measure

against number of items that are bought,
or number of items that are added to

basket, or some kind of business metric.

And you could track, based on the
search, what are the improvements

to the underlying business metric?

And then so what are you going to measure
against and then how are you going to

measure or how are you going to deploy it?

Then you get into
aspects like A B testing.

So segregating users up into different
groups and essentially giving them

a different algorithm and then
measuring the results of that group.
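A minimal sketch of the user bucketing behind an A/B test like that; the experiment name and variant labels here are made up for illustration.

```python
import hashlib

def assign_variant(user_id: str, experiment: str, variants=("control", "new_ranker")):
    """Deterministically bucket a user so they always see the same algorithm."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]

# Route the query to the algorithm for this user's bucket, then log the
# business metric (add-to-basket, purchase) against the variant:
# variant = assign_variant("user_42", "ranking-v2-rollout")
```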

Or if you find, for example with some tail queries, that you often don't get enough traffic on those tail queries, then you can look at other methods like interleaving, which

is essentially running two algorithms
at the same time and interleaving

the results, and then essentially
trying to then work out, based on

the interleaved result set, what was
the net effect of each algorithm.
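Team-draft interleaving is one common way of doing this; below is a simplified sketch, not necessarily the exact variant used on any of the systems discussed here.

```python
import random

def team_draft_interleave(results_a, results_b, k=10):
    """Simplified team-draft interleaving: alternate picks from two ranked lists
    and remember which algorithm contributed each shown document."""
    a, b = list(results_a), list(results_b)
    interleaved, credit = [], {}
    a_first = random.random() < 0.5
    while len(interleaved) < k and (a or b):
        order = [("A", a), ("B", b)] if a_first else [("B", b), ("A", a)]
        for team, pool in order:
            while pool and pool[0] in credit:
                pool.pop(0)                      # skip documents already shown
            if pool and len(interleaved) < k:
                doc = pool.pop(0)
                interleaved.append(doc)
                credit[doc] = team
        a_first = not a_first
    return interleaved, credit

# At serving time, log which team each clicked document was credited to;
# the algorithm that accumulates more clicks across sessions is the likely winner.
```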

So there are a bunch of ways to measure your improvements, or perceived improvements, to search, and then obviously you want rollout and rollback mechanisms and that kind of thing in case you've made a mistake.

Russ Cam: It's maybe worth pointing out
as well the sort of the bias of human

nature here in that, people tend to
report things that are wrong in some way.

And so back to your point that you
asked, which was would you add these

bad results to a collection of queries?

Yeah, you can totally do that.

I think that is something that every place I've seen does. But yeah, there is an element of bias there, in the sense that, for example, when people are reviewing things, they tend to write negative reviews but don't often write positive reviews.

So there's a natural human bias there
to report things that are bad over

things that are necessarily good.

Nicolay Gerold: and when you're running
these test suites, I imagine you're not

running it against a production database.

How do you set it up in the background?

Does every search engineer have their own development database, which has a copy of the production database, or a copy of a small set of the production data?

Stuart Cam: That's a good question. I mean, it's obviously environment dependent, right?

But the best case scenario is that you are running against a production data set.

That's the best case scenario.

In some instances you're talking
about having I hate the word, like

a backdoor or some kind of other
mechanism into the search system where

you can run your evaluation tests.

On the quality of the data: it's often the case in corporations that the production data is the gold standard. It's polished, it's maintained, it's true, if you like.

And then it's often the case that in the precursor environments, say pre-production or test or development, the data isn't as good.

Ultimately there are many reasons for that: cost is one, maintenance is another.

There's all sorts of
reasons why that's the case.

So if you're not running the algorithms on production-like data, there's a danger that you're now evaluating the algorithm against something that's inconsistent with production.

You want to run it
against production data.

Ideally, you'd like to run it
in the production environment.

So usually that's a case of, negotiating
with your DevOps team about trying

to find a way of, having this sidecar
of the evaluation process running.

The one thing I would say with
that evaluation process you're

potentially talking about running
tens of thousands of queries.

So try not to denial of service
your own system in the process.

You might want to have some kind of staggered approach to running those queries.

I've certainly seen servers
get very warm from when the

evaluation process is running.

So try not to do that, but
yeah, you want to run it in the

production system basically.

Russ Cam: Yeah.

There's maybe something else to add there.

Yeah, you really want to be running against what is in prod, because staging and dev usually are not the same.

If you can mirror prod as well, have a real-time mirror of the production system, that's also good.

The reason why is that oftentimes you do want to run evaluations where you, for example, want to return more debug information or more explain information back from the system.

And so that, can put large additional
strain on the production system.

And so if you can offload that
and manage that separately on

a mirror, that's much better.

Stuart Cam: Yeah, that's a good,
that's a good call out, Russ.

For example, if you use explain equals
true with Elasticsearch, brace yourself

for the wall of text that comes back.

And all of that is obviously
memory allocated on the server.

Yeah, that's a fair call to, to
push towards a mirrored system.

Russ Cam: Yeah, or profiling
queries is another one that's

just super expensive to do.
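For reference, both flags are part of the standard Elasticsearch search API. A hedged sketch against an assumed local cluster and a hypothetical products index:

```python
import requests

ES = "http://localhost:9200"          # assumed local cluster
query = {"match": {"title": "seat belts"}}

# explain=true attaches a scoring breakdown to every hit -- verbose and
# memory-hungry, so better run on a mirror than on the production cluster.
body = {"query": query, "explain": True, "size": 5}
# profile=true instead reports per-component query timings, also expensive:
# body = {"query": query, "profile": True, "size": 5}

resp = requests.post(f"{ES}/products/_search", json=body, timeout=30)
hits = resp.json()["hits"]["hits"]
if hits:
    print(hits[0]["_explanation"])    # per-clause scoring detail for the first hit
```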

Nicolay Gerold: I think the additional complexity in search is that you have so many different interleaved components.

Let's maybe move into that.

What are the main components?

When you look at a search system,
what are the main architectural

components you have to have in place?

And whether that's actual infrastructure, business logic, or processes that have to be in place.

Stuart Cam: Really good question.

A good question.

Look, I hate to say this, but it depends; it does depend on what it is you're trying to do.

If we were to pull up an example from
what we were speaking about before,

a very simple search system could
literally be, Something like a query

string query on Elasticsearch where
you're taking the user input without any

processing whatsoever and passing it into
Elasticsearch and hoping for the best.

That is, shall we say, the minimum required architecture.

Now that's going to

Russ Cam: Can I just jump in there a

Stuart Cam: Go on, Russ.

Russ Cam: You'd quickly want to move
from query string query onto simple query

string query so that any syntax problems
in the query don't take down the server.
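To make that concrete, here is roughly what the two request bodies look like; the field names are illustrative:

```python
user_input = "seat belts AND ("   # raw, possibly malformed user input

# query_string parses full Lucene syntax strictly: the unbalanced parenthesis
# above produces a parse error instead of results.
strict_body = {
    "query": {"query_string": {"query": user_input, "fields": ["title", "body"]}}
}

# simple_query_string ignores invalid syntax rather than failing, which makes it
# the safer choice for raw, unsanitised user input.
lenient_body = {
    "query": {"simple_query_string": {"query": user_input, "fields": ["title", "body"]}}
}
```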

Stuart Cam: So that would be your step 1.

Definitely.

That's your sort of super simple,
I would say that's virtually

no architecture, essentially.

You're just passing the
raw string input in.

But as you go on, there are, shall we say, well-known hurdles that you end up having to jump through in order to conduct a search.

Now, a lot of this is use case dependent, and I don't know if there's an architecture that suits all types of search, because there's a big difference, for example, between a search system that takes maybe two or three keywords versus a search system where users are entering, say, full paragraphs of text. They are very different, and that's not even getting into multimodal, for example, where people are searching on images or on video.

So there's no one size fits all.

I think it's probably worth just
constraining it to say, for example,

full text search, because that's
what a lot of search systems are

that are out there at the moment.

What are some of the components
that we see in search systems?

As you said at the beginning, the user types in their query; that's the user query essentially.

So then you think about, okay what
do we know about that piece of text?

You may want to rewrite it, for example.

You might want to, I don't know, standardize the case for it. Make it all lowercase, for example.

You might want to remove weird
and wonderful punctuation.

You might want to strip out all
of the emojis and all of the other

crazy stuff that's put in there.

You might not want to either.

You might want to leave that in there.

So you'd want to essentially
think about rewriting that query.

So user query comes in and
you rewrite it in some way.

How do you, what does
that rewrite look like?

You have different types of rewrites.

Do you make it completely opaque? Everything that happens after that rewrite is just a one-time transformation, and you lose the history of what's happened prior to that rewrite.

Is it a rewrite where you
could maybe see the history of

rewrites that have happened?

Is it a forked rewrite?

So now, I type in, say I search for
iPhone, what does that get forked

into Apple iPhone, mobile phone?

Does it become multiple different things?

Do each one of those multiple
rewrites then start their own search?

Rewriting is one one thing that we see.

Tokenization is related to rewriting, which is: given a sequence of characters, how do you atomize that into a series of other components that then make sense?

And there are edge cases to this,
for example, if I search for seat

belts and I put a space in there,
am I searching for seats and belts?

Probably not, I'm probably
searching for seat belts.

So there are, edge cases to rewrites
that you need to think about.

And that will be corpus dependent as well; your corpus and what you're having people search over will inform some of your tokenization rules. You might want to do whitespace tokenization.

You might want to factor in noun phrases. You might want to look for common kinds of terms, word forms that go together.

Seatbelts is one example there.

So you have tokenization.

And this is really just before
you even get to the search

engine in some instances.

Because you have your own data
sets that you can refer to.

So essentially it's like text manipulation
that happens prior to running your search.
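A minimal sketch of that kind of pre-search text manipulation; the phrase and synonym tables here are purely illustrative, since in practice they would come from your own corpus and query logs:

```python
import re
import unicodedata

PHRASES = {("seat", "belts"): "seat belts"}                       # keep multi-word units together
SYNONYM_FORKS = {"iphone": ["iphone", "apple iphone", "mobile phone"]}

def normalize(raw: str) -> str:
    text = unicodedata.normalize("NFKC", raw).lower()
    text = re.sub(r"[^\w\s-]", " ", text)                         # strip emojis and stray punctuation
    return re.sub(r"\s+", " ", text).strip()

def rewrite(raw: str) -> list:
    """Return one or more rewritten queries (a forked rewrite when synonyms apply)."""
    tokens = normalize(raw).split()
    merged, i = [], 0
    while i < len(tokens):                                        # re-join known phrases so that
        pair = tuple(tokens[i:i + 2])                             # "seat belts" is not searched
        if pair in PHRASES:                                       # as "seats" and "belts"
            merged.append(PHRASES[pair]); i += 2
        else:
            merged.append(tokens[i]); i += 1
    query = " ".join(merged)
    return SYNONYM_FORKS.get(query, [query])

print(rewrite("Seat  Belts!! 🚗"))   # ['seat belts']
print(rewrite("iPhone"))             # ['iphone', 'apple iphone', 'mobile phone']
```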

Russ Cam: If we were to

Nicolay Gerold: Is that all one bracket, basically the query understanding part, which is its own service that is basically doing all the pre-processing on the live queries that are coming in?

Stuart Cam: Yes. Yeah, this all fits under that umbrella of query understanding.

Russ is desperate to talk, so I'll hand over.

Russ Cam: Yeah, if we took a step
back for a second and we just talked

about, what are the big parts of
any search system that you have?

You have ingestion and indexing, which could be as simple as some kind of bulk overnight import process, or it could be some event-stream-based, Kafka-queue-based mechanism to stream in changes from your underlying source of truth data into the search system.

That ingestion and indexing can be somewhat simple. You could go with defaults. Or it could be vastly complicated.

It could incorporate all sorts of
different signals into there that need

to be indexed into the search engine.
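As a rough sketch of the streaming flavour of that, assuming a Kafka topic of document updates feeding the Elasticsearch bulk API; the topic, index, and field names are hypothetical:

```python
import json
import requests
from kafka import KafkaConsumer          # pip install kafka-python

ES, INDEX = "http://localhost:9200", "products"   # assumed cluster and index

consumer = KafkaConsumer(
    "product-updates",                            # hypothetical change-event topic
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

batch = []
for message in consumer:
    doc = message.value
    # Bulk API format: an action line followed by the document source.
    batch.append(json.dumps({"index": {"_index": INDEX, "_id": doc["id"]}}))
    batch.append(json.dumps(doc))
    if len(batch) >= 1000:                        # flush in chunks, not per document
        requests.post(f"{ES}/_bulk",
                      data="\n".join(batch) + "\n",
                      headers={"Content-Type": "application/x-ndjson"},
                      timeout=60)
        batch = []
```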

And then the next sort of big layer
as Stu was touching on there is

obviously querying and what happens at
query time in the search pipeline and

query understanding phase, retrieval
phase, ranking phase, et cetera.

And then the other sort of big part is analytics, where you're capturing metrics and feedback on the data that you're showing to people.

So things like clicks and click streams, that kind of information: what rank in the search results users are clicking on. All of that is, I would say, very voluminous data that you typically want to capture and then build models on top of.

And then, yeah, evaluation, so putting
that data that you're capturing into

motion and into practice to build
some better systems that you can

then incorporate into that query
search pipeline side of things.

Those are kind of the big key components; I would say there are a few other ones.

From a systems perspective, obviously operational resilience of that system is super, super important.

But then, yeah, interpretability and understandability or explainability for the results that are being shown: having some kind of provenance for the results that you're seeing, particularly if you're pulling them from multiple different data sources, or where there may be multiple different rankers involved within a pipeline, happening in stages. How did this search result actually end up appearing here for this given user?

Can I clearly explain exactly what happened in each component, to understand how the results were the way they were?

Stuart Cam: That's a much better, broader answer than I gave, Russ, because I jumped straight into the first part of query understanding.

So thank you for taking a
30, 000 foot view there.

Nicolay Gerold: And when I'm
looking at the system, I think

there are so many implementation
details, even just for indexing.

Like what you mentioned, you can have a batch workload and a stream that's coming in at the same time, but underneath it, in vector databases, it's now common that you basically have two separate stores. One is the already-indexed data and one is the data that's coming in live, just because the indexing time is so long that you can't insert a document and have it in the search immediately, so you keep it in separate storage.

What are like the decision factors
you're considering when you're

setting up a search system?

Like when you're looking at all
the different components you have,

and you basically have to decide
like what to go with for each step.

What is coming into play here?

Is it mainly the user behavior?

Is it the type of documents that's being
searched or the types of search, whether

that's multi modal, full text, vector?

Stuart Cam: That's a good question.

It's a good question.

I think the way it's often been described: say, for example, a company wants to do search or they want to improve their search.

Usually there's a set of functional and non-functional requirements, right?

So a functional requirement would be something like: we have X billion documents or X million documents, they change on this cadence, and they look a certain way, so maybe they have a certain weight to them, a certain number of kilobytes or megabytes or whatever the thing is.

So that is usually a description of, shall we say, the corpus size and its changeability.

So that's usually one thing that's described. Hot on the heels of that is usually: we want searches to happen within X milliseconds or X seconds, let's say.

Usually it's in the milliseconds, and usually it's under a hundred.

So given a search query, we expect a set of results back in this amount of time.

So corpus size, expectations around
search latency are often there.

And then cost is usually a factor as well.

We don't want it to
cost X amount of money.

So those are the three kind of
highlights that are often presented.

Now, in terms of what's available,
feasible or possible, then we

would essentially work out what's
the most appropriate architecture.

And sometimes it's, there's
no free lunch in all of this.

Often it's a case that, to service
billions of documents on a vector

search, less than 10 millis, you
can do those kinds of things.

They're just incredibly expensive.

It's really about trying to, manage
all of the overall system to find

what is an acceptable balance.

So it's either going to come down
to speed, cost, or corpus size.

That's usually how these things play out.

Russ Cam: Yeah, I think that's a good point: you really need to understand the requirements and the pain points you're trying to solve.

I think you need to understand the
constraints of the environment that

you might be working in as well.

So for example, building a search system for a small company that doesn't have dedicated search engineers, that has perhaps a few well-rounded engineers looking after everything: what's the most appropriate thing to build when they're looking for some search capability? Now that's obviously very different to a larger company where search is absolutely paramount to the user experience of their system.

Yeah, and there are very likely to be dedicated search engineers, machine learning engineers, and data scientists working together on trying to solve problems collectively.

What fits for that is going to look very different to that first case.

And then, yeah, the sort of technologies and techniques that you might be looking at might be quite different too. There might be a tendency, for example, in that first case, to look at systems that you can buy and get relatively far with off the shelf, versus investing a lot of time in building out more complicated but componentizable systems that allow you to experiment with different components and different parts of the system over time.

Yeah it's the old adage.

It really depends.

Nicolay Gerold: And to make it a little
bit explicit we've talked before about

the different trade offs you can do.

Can you maybe go into the main
trade offs you're considering?

Because I think for a search system
it's worth it in the beginning

to make them all explicit.

For example, cost.

Is it like, I can maximize on performance
of the search system but it will

just become prohibitively expensive.

Stuart Cam: yeah.

So I mean there's there's a few kind
of trade offs that you can make.

Relevance and performance is one.

So do you know, what
do you care about more?

Do you care more about the
performance of the search versus

the accuracy and relevancy?

For example, with some ANN queries, you can essentially bake in your retrieval characteristics of how much performance versus accuracy you'd like. What's important to you there?

The next one I think is
indexing versus querying.

It is often the case that indexing and querying are like two sides of the same coin, essentially.

And I've seen examples where to improve
the search, you would change the indexing.

Instead of improving, for example, the query itself, you would actually add an additional set of, shall we say, pre-processing or other algorithmic changes in your indexing that allow data that's already pre-computed to go into the search engine, which can then be used at query time.

Now the downside of that is that you're essentially baking that computation into the indexing stage.

So if you want to change your query,
it's often the case that, okay, we

may have to go back to the beginning
and re index all of your data

using a slightly different form.

So that's often a trade
off that can be made.

It's often the case that trying to run an algorithm at query time is slower than if it's been pre-computed at indexing time.

So that's a trade off that
happens quite regularly.
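A hedged illustration of that trade-off using Elasticsearch-style request bodies; the popularity signal and field names are made up, not anyone's production scoring formula:

```python
# Index time: precompute the signal once and store it on the document.
# Changing the formula later means reindexing everything.
def enrich_for_indexing(doc):
    doc["popularity_boost"] = 1.0 + doc.get("clicks_30d", 0) / 1000.0  # illustrative signal
    return doc

index_time_query = {
    "query": {
        "function_score": {
            "query": {"match": {"title": "iphone"}},
            "field_value_factor": {"field": "popularity_boost", "missing": 1.0},
        }
    }
}

# Query time: compute the same signal on the fly with a script score.
# Flexible to change per request, but every query pays the computation cost.
query_time_query = {
    "query": {
        "script_score": {
            "query": {"match": {"title": "iphone"}},
            "script": {"source": "1.0 + doc['clicks_30d'].value / 1000.0"},
        }
    }
}
```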

I'll hand over to Russ for a couple
of other trade offs that we've seen.

Russ Cam: Yeah, I think that's one of
them pre computing features or signals

at index time versus query time.

That's a classic trade
off between the two.

Indexing is usually faster; querying is more flexible, and you can change things more on the fly, but it typically results in slower searches.

I think that kind of touches on a wider
point, which is there is a balance here

between the latency of search results
that you're trying to hit and the

level of complexity or personalization
or query understanding that you might

decide to go into in a search pipeline
to be able to improve the results.

There can often be a balance and a trade-off between those. You might have, for example, an algorithm that you can demonstrably show massively improves the results, but if it's going to take three seconds for every single person, it's going to be unfeasible.

So there is that trade off
to make there there too.

There are ways of building into the
system around resiliency and, cutting

off after a certain point where you
can try and build that stuff in.

But yeah, that's typically one of
the other trade offs you can see.

Nicolay Gerold: And that's, I think, the current trend especially, because most people are building RAG systems in a way that the search is very crude, so you have to actually add stuff that can increase the relevancy for the user in the end, so you have to have a re-ranking component, which is very costly.

Especially looking at if you're doing something like ColBERT.

Russ Cam: Yeah, exactly.

It's the old adage of garbage in, garbage out, in the fact that you want your retrieval phase to surface the most relevant results it possibly can.

But there is a trade off there in
terms of how much you can feasibly do

or what signals might be available en
masse to be able to do that for all

of your users at the retrieval phase.

So then often, yeah, you would
want to employ more complex but

better ranking mechanisms in a
post retrieval ranking phase.

That allows you then to finesse
and, fine tune those results that

come back from the retrieval phase.

But you want to make sure that you
still try and get in that bucket back

from the retrieval phase the very
best results that you can in order for

ranking to be able to do a better job.

Stuart Cam: You can only re-rank what you've been given to re-rank.

So if the gold nugget isn't in that initial set, then it's never going to make its way to the top, regardless of what re-rankers you have in place.

And that is in itself is a trade off.

And, re ranking is often used for
tying in, personalization or machine

learning models that are inherently just
difficult to host within a search system.

They're excellent for that.

But yeah, as I said, you can
only rank what you've been given.

So if you don't have it,
you're not going to re rank it.

And that's interesting because we've seen approaches whereby, if you, for example, need to serve 100 results to a user, you think about over-fetching, say, 300-400 results from your retrieval system and then running that through a re-ranking, so that of the 300-400 you're at least giving more information to the re-ranker, so that the top 100 is, in theory, drawn from the top 400, as opposed to the top 100 just simply being reordered.

There are approaches for overfetching
where you can hack some of that.

That does make paging very interesting
as soon as you start doing overfetching,

but, that's one approach to potentially
get around some issues there.
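A minimal sketch of that over-fetching pattern; retrieve and rerank_score are placeholders for your retrieval call and whatever (possibly personalised) re-ranker you plug in:

```python
def search_with_overfetch(query, user, retrieve, rerank_score, serve_k=100, fetch_k=400):
    """Over-fetch candidates from retrieval, re-rank them, serve only the top slice.

    retrieve(query, k)             -> ranked candidate documents from the engine
    rerank_score(query, user, doc) -> relevance score from the re-ranking model
    """
    candidates = retrieve(query, fetch_k)              # e.g. 300-400 instead of 100
    reranked = sorted(candidates,
                      key=lambda doc: rerank_score(query, user, doc),
                      reverse=True)
    return reranked[:serve_k]

# Note: paging gets awkward here -- page 2 of a re-ranked, over-fetched set is not
# simply "results 101-200 from the engine", so cache or recompute consistently.
```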

Nicolay Gerold: Yeah, and when you're coming into a new search project:

Do you have a set of questions you're
asking the people or you're asking

yourself to run through to basically
figure out the requirements you're

facing in that particular system?

Stuart Cam: Yes, they're usually scattered and numerous.

I think the top three for me are usually along the lines of: how big is the corpus?

Where does the data come from?

What's the source of truth, in
your environment, what is the

feed into this search system?

Because there's usually not just one system; often it's an accumulation of multiple different systems.

So, okay: what is feeding it, how big is it, how many documents, what size of documents, and what are the expectations of rate and latency?

Where does data governance fit in as well?

This is usually something that crops up.

Search systems inherently have a tendency to attract everything in.

They can become like hubs.

You've got, say, for example, five
different departments, and the

combination of those five different
departments information all has to go

into the search system for some reason.

So now your search system is essentially
the master of those five pieces

of, those five departments, right?

And then you end up with
interesting questions.

Who actually now owns that data?

Is that the search team?

Are the search team now responsible
for the hygiene, cleanliness,

et cetera, of that data?

Or are the individual departments?

And how do you manage changes
and expectations of changes?

What happens with the data?

So governance is one that
comes up quite often.

And yeah, usually things like: what cloud providers are they using, what are their expectations around DevOps, what does their number of developers look like, what languages are they using.

So it's a kind of the broad set
of, IT questions that you might

ask plus then all of the search
questions that you would ask as well.

Russ Cam: Yeah, I would say one of the key questions for me, and we touched on this earlier, is: how are you currently evaluating the system that you have? Because in order to be able to improve a system, we need some form of measure to work against.

So do you currently have a system in place that you're able to point towards and say, yeah, we can see that this is where we're tracking for this stuff, these are some North Star metrics that we have for things?

Do they have that system in place
already or is that something

that they've thought about?

And again, are they doing it in the small, or is that something that also needs to be built as a precursor to looking at improvements in the search system?

Nicolay Gerold: How often is
it the case that you actually

have an eval set already set up?

Russ Cam: Fairly...

Stuart Cam: It varies.

Russ Cam: Yeah, it varies fairly frequently.

There might be different cases where
it's maybe not being used effectively

or in a format where, say the data
is being collected, but it's not

really being put to use effectively.

That's fairly common.

Stuart Cam: It's quite common to see KPI
metrics around, general business activity,

but then not see that tied up to search.

So it's we're tracking the amount
of inventory we're selling,

we're tracking page views.

But we're not associating
it to a search in some ways.

Often, it's a case of: actually, if you can tie this up to user behavior around the search, now you have the complete path available to you.

You either see fragments, where they're half collecting, or not collecting at all, I would say.

Those are the two common scenarios.

It's rare to see proper evaluation
done, I would say in most places

that's where we do see a lack.

Russ Cam: I think that's a good point
that you touch on there, Stu, is that

if you're able to tie in changes or
uplift in the search system to, the

core underlying business metrics.

It's very easy then, or easier, I should say, to make cases for further investment in improvements in those systems.

Because ultimately that is the goal of any of these improvements you're making: you're ultimately trying to uplift the business value that you're getting from these systems.

And so if you have a clear case of being able to tie those things together, to say: hey, this algorithmic change that I've made here, I've run it through offline evaluation, and I can show that it results in a 5 percent improvement, or even a 2 percent improvement, or even less than that. But if I take that to interleaving or an A/B test, it can result in X amount of uplift for the business, for the company, over the year.

I think it's good to get into that
kind of mindset of thinking when

you're looking at improvements.

Stuart Cam: Yeah, that's an interesting thing you bring up there, Russ, which is around the delta differences that you can see in improving search.

It's not uncommon to see improvements in the single digit percentages. So, you know, you might see a 1-2 percent uplift.

So you have to think about, as you
go through your search journey,

if you like, you might find that
there are some very quick wins.

You might get your 10-20 percent improvement and everybody's high-fiving, like, we're really improving this thing.

And then you might see that diminish.

Okay, so you might see
it getting into the 1%.

Actually, I would say a good marker of a sophisticated system is one where you actually see it's getting worse.

Because if it's getting worse,
in some ways, you might be at

some point of maxima, right?

You might be discovering a maxima.

Yeah, it's compounding effects over time that you're interested in. A 1 percent improvement over 50 improvements is still quite a significant change.

Russ Cam: Yeah, and if you have some
versioning to these algorithmic changes

that you're making, even better, because
then you can look historically at these

changes that you've made and, determine
how you've ended up and landed into this

weird, complex, convoluted query and
ranking system that you have now, why, how

and why you've ended up in that situation.

But yeah, to your point, Stuart, it very much does follow a Pareto principle: 80 percent of the work is down to 20 percent of the improvements.

Nicolay Gerold: And I always like to go
into that in the end, like overrated,

underrated in search, especially what
are things that are overrated and what

are things that are underrated to do?

Russ Cam: I'm going to go out
on a limb here which is always

risky to do on a podcast.

But, I think that.

There's obviously a lot of hype, for good reason, around vector search and similarity-search-based systems.

I think that what we are seeing
is that none of these are going

to replace full text search.

I think what I've typically seen over time is that more and more places may have initially jumped on a RAG, vector-based approach to things, which isn't necessarily suited for all kinds of searches.

But over time they've ended up actually taking more of a hybrid approach, because the full text search side of things brings things that a vector-based approach isn't as effective at as a full text approach is.

So for example, if you want to find search
results that exactly match the keywords

that are being put in there, full text
search is perfectly good at doing that.

Yeah, so I think it's, I don't see things
as being necessarily a replacement.

It's a hybrid approach and a
combination of approaches that

we're seeing happening here.

Stuart Cam: I'm pleased that you've gone out on the limb there, Russ, because I was going to go out on the same limb. I would echo what Russ has said there: vector search definitely has its place.

I think it might be suffering from
the, it's the shiny thing out there

so everybody's flocking to it.

But I still think there's plenty of mileage that can be covered by just improving your full text search, and there's an argument to say that in many instances we've seen that's actually easier than taking on a vector-based approach, which often requires machine learning skills; now you have a machine learning hosting problem.

You have a whole index and ingest problem there as well.

You have all the costs associated to it.

Whereas actually, all you really
needed to do was change a few

lines in your text processing.

We've seen instances like that.

I would say underrated.

And I did go off on my 30-foot view on rewriting and tokenization earlier, but I would say underrated is definitely query understanding; that is, failing to properly understand what the user intended through their search, and failing to translate what they entered into a meaningful search query.

That's definitely underrated.

Russ Cam: Yeah, I was just going to say on that point: I think some of it goes back to when you start to build out a search system, what things you start to put in place, and how you build up a more complicated architecture over time.

It's very typical to take the user query verbatim, do no query understanding on it, pass it to the

search engine, get the results back,
serve them back to the user, then start

to maybe perhaps introduce some form
of post retrieval ranking in there

perhaps for personalization and then
to build out these layers, but that

whole sort of query understanding phase
and trying to, better understand user

intent, I think is something that is
often undervalued from what we've seen.

Nicolay Gerold: Yeah.

And if people want to improve the
search system and get in touch with

you and follow along with you in
general, where can they do that?

Stuart Cam: So they can reach us.

So Russ and I both have a consulting
company called Search Pioneer.

So we're available at www.searchpioneer.com.

And we're also reachable on LinkedIn
under the same name, Search Pioneer.

So either of those is the
best way to get in contact.

If you go to the site, there's a form, and you can read about clients that we've interacted with, some of the success stories, and what we're all about.

And if that's something that interests you and you believe we can help, then get in contact and we can discuss your situation and hopefully see if we're able to help you.

Nicolay Gerold: Okay, what can we take away?

What I'm always interested in first is how they make decisions and what their frameworks are for thinking through the problems they want to solve.

And I think the constraints they called out for search infrastructure are very interesting: corpus size, latency requirements, cost limitations, but also data governance needs.

These are interesting guardrails for basically coming up with an architecture.

And the fact that there is no free lunch means you really have to think through what you are optimizing for.

This should also be informed by some kind of business metrics or performance metrics that go beyond your search system, in addition to the search metrics or performance metrics, like latency, that you can actually measure in your system.

The trade-offs often come very early, in that you have to decide when you actually optimize.

So, for example, you can incorporate a lot of stuff at indexing time, but then you can't really adapt at query time, which often limits how relevant you can make search results if you only bank on the information you have available at indexing time, because you have less user information. So you can do less personalization, for example.

And I think the relevance versus performance trade-off is probably one of the most interesting and important ones you have to consider.

The main architecture components they broke down: you basically have the ingestion and indexing, then you have the search pipeline, and then you have the analytics or the feedback loop.

The search pipeline you can basically break down into three separate components: first, query understanding, which is often its own pipeline; then the retriever component, which is actually the interaction with whatever search database they're using; and then lastly, the ranking component, which ranks all the search results, prioritizes them, or does fusion if you have multiple different search databases.

They had basically three different validation approaches, or a three-layer testing strategy: first, a golden query set, which basically covers head, mid, and tail queries; then you have your automated evaluation suite, which builds on these queries with NDCG or reciprocal rank or whatever you are using; and then at the end you basically have your production A/B tests, or something like interleaving, where you have multiple algorithms and you interleave their results to evaluate the system as a whole.

I would add on top of that, basically, the business metrics. Because when you're implementing a completely new search system, you should also, in e-commerce for example, measure: okay, what's the revenue impact?

You can never exactly measure that. It's not like an A/B test where you have two different versions of a system running; rather, you go from zero to one or you do a major overhaul. So you actually want to keep an eye on the business metrics as well.

What surprised me a little bit was their advice that you always have to, or should, run your tests against production environments and not really dev or staging environments, because you need to test against production data.

And I think this is partially informed by them having worked on very, very large systems, where it's actually very impractical or even infeasible to run the complete database multiple times because it's so large. So you probably can't really replicate the production system into a staging environment.

I think most of you work on smaller search systems, so you can likely do a complete replica and test against production data, but in a dev or staging environment. If that doesn't apply to you, go with their approach, but be really careful that you're not DDoSing your own service.

And especially when you add stuff like debug operations, like explain=true in Elasticsearch, you should be really careful around those, because they really increase the load or strain you're putting onto the system.

Next week we will be continuing with more on the systems perspective on search, and I'm excited for that. So leave a like and subscribe so you can stay notified when the next episode comes out.