Spatial Stack with Matt Forrest

Large Language Models can write poetry and debug code, but they still don't understand the fundamental physics of the real world. Ask an AI to find the "nearest restaurant" to a specific coordinate, and it struggles because it lacks Spatial Intelligence.

In this episode, we sit down with Jia Yu, the co-creator of Apache Sedona and co-founder of Wherobots, to discuss why geospatial data breaks standard big data engines and how he built a solution that is now downloaded over 2 million times a month.

We trace the 10-year journey from a PhD research paper to a top-level Apache project, diving into the deep technical challenges of distributed computing. Jia explains why spatial data requires a completely different architecture than standard text or numbers and how the industry is finally moving toward a "Spatial Lakehouse" to break down data silos.

In this episode, we explore:

- The "Multimodality" Trap: Why mixing vector, raster, and LiDAR data crashes traditional systems.

- How SedonaDB is bringing massive scale to single-node machines (so you don't always need a cluster).

- The hardest problem in distributed computing: How to split a map across 1,000 servers without breaking the data.

- The multi-year fight to get native geometry support into Apache Iceberg.

- Why the next generation of models must evolve from text-based to spatially intelligent.

✅ Sign Up for Wherobots: https://wherobots.com/
✅ Learn more about Apache Sedona: https://wherobots.com/apache-sedona/
✅ What is Apache Sedona: https://wherobots.com/blog/what-is-apache-sedona/
✅ Test out SedonaDB: https://sedona.apache.org/sedonadb/latest/
✅ Connect with Jia on LinkedIn: https://www.linkedin.com/in/dr-jia-yu/ 

00:00:00 - Intro & Welcome 
00:00:51 - The Origin Story: From GeoSpark to Apache Sedona 
00:06:03 - Why Geospatial Data is "Special" (The Multimodality Problem) 
00:09:47 - When to Move to Distributed Computing? 
00:13:21 - The Secret to Maintaining a Vibrant Open Source Community
00:18:11 - The Features That Drove Adoption: Spatial SQL & Python 
00:22:35 - Deep Dive: How Spatial Partitioning Works 
00:28:57 - Why Build a Cloud-Native Platform? 
00:33:05 - The Rise of the Spatial Lakehouse & Apache Iceberg 
00:40:17 - Introducing SedonaDB: A Single-Node Engine 
00:45:10 - The Future: Why AI Needs Spatial Intelligence 
00:48:44 - Advice for Getting Started with Spatial Engineering

📰 Daily modern GIS insights: https://forrest.nyc

CONNECT WITH ME
📸 Instagram: https://www.instagram.com/matt_forrest/
💼 LinkedIn: https://www.linkedin.com/in/mbforr/
📧 Newsletter: https://forrest.nyc
🌐 Website: https://forrest.nyc

What is Spatial Stack with Matt Forrest?

Welcome to The Spatial Stack, where modern geospatial technology takes center stage. Our episodes feature interviews with leading experts, insightful discussions on the integration of AI and big data in spatial tech, and case studies on groundbreaking projects worldwide. Tune in to stay ahead in the rapidly evolving world of geospatial technology!