Large language models can write poetry and debug code, but they still don't understand the fundamental physics of the real world. Ask an AI to find the "nearest restaurant" to a specific coordinate, and it struggles because it lacks spatial intelligence.
In this episode, we sit down with Jia Yu, the co-creator of Apache Sedona and co-founder of Wherobots, to discuss why geospatial data breaks standard big data engines and how he built the solution that now sees over 2 million downloads a month.
We trace the 10-year journey from a PhD research paper to a top-level Apache project, diving into the deep technical challenges of distributed computing. Jia explains why spatial data requires a completely different architecture than standard text or numbers and how the industry is finally moving toward a "Spatial Lakehouse" to break down data silos.
In this episode, we explore:
- The "Multimodality" Trap: Why mixing vector, raster, and LiDAR data crashes traditional systems.
- How SedonaDB is bringing massive scale to single-node machines (so you don't always need a cluster).
- The hardest problem in distributed computing: How to split a map across 1,000 servers without breaking the data.
- The multi-year fight to get native geometry support into Apache Iceberg.
- Why the next generation of models must evolve from text-based to spatially intelligent.
✅ Sign Up for Wherobots: https://wherobots.com/
✅ Learn more about Apache Sedona: https://wherobots.com/apache-sedona/
✅ What is Apache Sedona: https://wherobots.com/blog/what-is-apache-sedona/
✅ Test out SedonaDB: https://sedona.apache.org/sedonadb/latest/
✅ Connect with Jia on LinkedIn: https://www.linkedin.com/in/dr-jia-yu/
00:00:00 - Intro & Welcome
00:00:51 - The Origin Story: From GeoSpark to Apache Sedona
00:06:03 - Why Geospatial Data is "Special" (The Multimodality Problem)
00:09:47 - When to Move to Distributed Computing?
00:13:21 - The Secret to Maintaining a Vibrant Open Source Community
00:18:11 - The Features That Drove Adoption: Spatial SQL & Python
00:22:35 - Deep Dive: How Spatial Partitioning Works
00:28:57 - Why Build a Cloud-Native Platform?
00:33:05 - The Rise of the Spatial Lakehouse & Apache Iceberg
00:40:17 - Introducing SedonaDB: A Single-Node Engine
00:45:10 - The Future: Why AI Needs Spatial Intelligence
00:48:44 - Advice for Getting Started with Spatial Engineering
📰 Daily modern GIS insights: https://forrest.nyc
CONNECT WITH ME
📸 Instagram: https://www.instagram.com/matt_forrest/
💼 LinkedIn: https://www.linkedin.com/in/mbforr/
📧 Newsletter: https://forrest.nyc
🌐 Website: https://forrest.nyc