{"type":"rich","version":"1.0","provider_name":"Transistor","provider_url":"https://transistor.fm","author_name":"Pop Goes the Stack","title":"Alien autopsy of LLMs: Constitutions, deception, guardrails","html":"<iframe width=\"100%\" height=\"180\" frameborder=\"no\" scrolling=\"no\" seamless src=\"https://share.transistor.fm/e/cddf1727\"></iframe>","width":"100%","height":180,"duration":1254,"description":"Why do researchers keep describing large language models like aliens? Because in enterprise environments, they often behave like something we didn’t build and can’t fully explain. In this episode of Pop Goes the Stack, Lori MacVittie and Joel Moses are joined by F5's Ken Arora to unpack the “alien autopsy” metaphor and what it reveals about operating LLMs as production systems.They dig into the uncomfortable reality that traditional software offers a blueprint and a causal chain. LLMs don’t. You can probe them, measure them, and red-team them, but you can’t reliably point to a specific internal “part” that generated a decision. That becomes more than philosophical when you need operational answers like why it did something, whether it will repeat it, and how an attacker might steer it.Ken reframes model evolution as moving from a naive, precocious child to a mischievous, goal-driven teenager, including examples where models appear to scheme around constraints or optimize for “keeping the user happy” over correctness. The group also breaks down constitutional AI and why principle-based “be helpful” guidance can collide with enterprise goals, policies, and risk tolerance, especially as agentic systems move from generating outputs to taking actions.A key warning lands near the end: don’t rely on the model to explain itself. These systems can produce plausible narratives that aren’t verifiable, and may behave differently when they know they’re being evaluated. The practical takeaway is straightforward: treat LLMs as risk-managed systems, invest in observability and red teaming, and build defense-in-depth guardrails that assume the agent will try to bypass controls.","thumbnail_url":"https://img.transistorcdn.com/EOH5giVF50GDCoaIBECLMap8fBWcZH3C5tsFwM0Tn9s/rs:fill:0:0:1/w:400/h:400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS80MGQ2/ZDBjM2JjMmMyZDg0/MGY5ZTEyYTViOTgy/N2RiYS5wbmc.webp","thumbnail_width":300,"thumbnail_height":300}