{"type":"rich","version":"1.0","provider_name":"Transistor","provider_url":"https://transistor.fm","author_name":"Pop Goes the Stack","title":"Measuring what matters: Observability for agents","html":"<iframe width=\"100%\" height=\"180\" frameborder=\"no\" scrolling=\"no\" seamless src=\"https://share.transistor.fm/e/79829e21\"></iframe>","width":"100%","height":180,"duration":1224,"description":"Agents break the old rules of observability. Latency, throughput, and error rates still matter, but once software starts making decisions and taking actions on someone else’s behalf, the real question becomes: is it doing the right thing, and is it doing it for the right reasons? In this episode of Pop Goes the Stack, Lori MacVittie and Joel “OpenClaw” Moses are joined by observability expert Chris Hain to unpack what changes when systems become agentic. Instead of a single prompt-response interaction, you get decision chains that branch, loop, call tools, and evolve over time. A system can “succeed” operationally while still being wrong, expensive, or misaligned with intent. Chris argues you don’t have to throw away what already works. Distributed tracing still applies, but now each agent step becomes a span, decorated with richer metadata like model identity, tool calls, token usage, prompts, and cost. The discussion also dives into why standardization matters, including OpenTelemetry and emerging semantic conventions for generative and agentic AI, and why auto-instrumentation approaches like eBPF become critical when agents generate code that has no built-in telemetry. Joel adds a new set of metrics that feel uncomfortably necessary: decision loops per task, drift in tool-call chains, human override frequency, and the cost and token patterns that signal something has changed. The group also tackles the awkward feedback loop of using agents to make observability actionable, while acknowledging the risk of agents optimizing the dashboard instead of the system. If you’re building agentic workflows, this episode is a practical guide to why “failed successfully” is now a real production state, and why instrumenting for correctness and intent alignment is the next observability frontier.","thumbnail_url":"https://img.transistorcdn.com/EOH5giVF50GDCoaIBECLMap8fBWcZH3C5tsFwM0Tn9s/rs:fill:0:0:1/w:400/h:400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS80MGQ2/ZDBjM2JjMmMyZDg0/MGY5ZTEyYTViOTgy/N2RiYS5wbmc.webp","thumbnail_width":300,"thumbnail_height":300}