{"type":"rich","version":"1.0","provider_name":"Transistor","provider_url":"https://transistor.fm","author_name":"Spatial Stack with Matt Forrest","title":"#38: How Apache Sedona Solved Big Data’s Hardest Problem with Jia Yu","html":"<iframe width=\"100%\" height=\"180\" frameborder=\"no\" scrolling=\"no\" seamless src=\"https://share.transistor.fm/e/d37fa7f3\"></iframe>","width":"100%","height":180,"duration":3292,"description":"Large Language Models can write poetry and debug code, but they still don't understand the fundamental physics of the real world. Ask an AI to find the \"nearest restaurant\" to a specific coordinate, and it struggles because it lacks Spatial Intelligence. In this episode, we sit down with Jia Yu, the co-creator of Apache Sedona and co-founder of Wherobots, to discuss why geospatial data breaks standard big data engines and how he built the solution that now powers over 2 million downloads a month. We trace the 10-year journey from a PhD research paper to a top-level Apache project, diving into the deep technical challenges of distributed computing. 
Jia explains why spatial data requires a completely different architecture than standard text or numbers and how the industry is finally moving toward a \"Spatial Lakehouse\" to break down data silos. In this episode, we explore: - The \"Multimodality\" Trap: Why mixing vector, raster, and LiDAR data crashes traditional systems. - How SedonaDB is bringing massive scale to single-node machines (so you don't always need a cluster). - The hardest problem in distributed computing: how to split a map across 1,000 servers without breaking the data. - The multi-year fight to get native geometry support into Apache Iceberg. - Why the next generation of models must evolve from text-based to spatially intelligent. ✅ Sign Up for Wherobots: https://wherobots.com/ ✅ Learn more about Apache Sedona: https://wherobots.com/apache-sedona/ ✅ What is Apache Sedona: https://wherobots.com/blog/what-is-apache-sedona/ ✅ Test out SedonaDB: https://sedona.apache.org/sedonadb/latest/ ✅ Connect with Jia on LinkedIn: https://www.linkedin.com/in/dr-jia-yu/ 00:00:00 - Intro & Welcome 00:00:51 - The Origin Story: From GeoSpark to Apache Sedona 00:06:03 - Why Geospatial Data is \"Special\" (The Multimodality Problem) 00:09:47 - When to Move to Distributed Computing? 00:13:21 - The Secret to Maintaining a Vibrant Open Source Community 00:18:11 - The Features That Drove...","thumbnail_url":"https://img.transistorcdn.com/Xz7Kw-USgUgqeYeirGjCzTlP15he6re5sNuvcBU5SJM/rs:fill:0:0:1/w:400/h:400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS85OTgy/MDVhY2U5MjhiZDUw/NDBkNzAzYzIwNjk5/YjU1My5wbmc.webp","thumbnail_width":300,"thumbnail_height":300}