Fallthrough

Matt and Kris welcome back Bill Kennedy, who's been working on Kronk, AI tooling in Go. They go through what Kronk (and its dependency, Yzma) is, the power of running models locally, and how they see coding agents and LLMs shaping the future of the industry.

We've got supporter content, of course! This week that includes the message-caching trick that turns seconds into thirty milliseconds, why hybrid models quietly wreck the KV cache, Bill's "thirty billion knobs" explainer for dense versus mixture-of-experts models, why demo day is destroying engineering teams, and the stealth-startup math nobody says out loud. Not a supporter yet? Fix that today by heading over to https://fallthrough.fm/subscribe where you'll get not only extra content but also higher-quality audio. Sign up today!

If you prefer to watch this episode, you can view it on YouTube.

No episode of the aftershow this week. We'll have more aftershow episodes soon! In the meantime, catch up on previous episodes at https://break.show.

Thanks for tuning in and happy listening!

Table of Contents:
  • Prologue (00:00:00)
  • Chapter 1: Why Go for AI, and the Origin of Kronk (00:02:14)
  • Chapter 2: What Kronk Is, an SDK to Kill the Model Server (00:08:56)
  • Chapter 3: Running It Yourself: Hugging Face, GGUF, and Hardware (00:14:58)
  • Chapter 8: Context, Attention, and the Honesty of Claude 4.8 (00:22:28)
  • Chapter 9: A Coding Agent Is a Skill, Not a Magic Wand (00:27:22)
  • Chapter 10: Semantics vs Mechanics (00:44:18)
  • Chapter 11: Rebuilding From Scratch: Specs and the Machine That Builds the Machine (00:47:29)
  • Chapter 14: Why AMP Wins: Pedigree and Economics (01:00:20)
  • Chapter 15: How to Actually Drive a Model: Plinko, Rocky Balboa, and Attention (01:14:59)
  • Epilogue (01:25:08)

Creators and Guests

Host
Kris Brandow
Host
Matthew Sanabria
Matthew is an engineering leader focused on building reliable, scalable, and observable systems. Matthew is known for using his breadth and depth of experience to add value in minimal context situations and help great people become great engineers through mentoring. Matthew serves the Go community as a member of GoBridge. In his spare time, Matthew spends time with his family, helps grow his wife's chocolate business, works on home improvement projects, and reads technical resources to learn and tinker.
Guest
Bill Kennedy

What is Fallthrough?

A deep and nuanced conversational podcast focused on technology, software, and computing.