{"type":"rich","version":"1.0","provider_name":"Transistor","provider_url":"https://transistor.fm","author_name":"How I Tested That","title":"Chris Butler | How I Test AI Agents at GitHub","html":"<iframe width=\"100%\" height=\"180\" frameborder=\"no\" scrolling=\"no\" seamless src=\"https://share.transistor.fm/e/1066b7cf\"></iframe>","width":"100%","height":180,"duration":2947,"description":"Summary: In this episode, I’m joined by Chris Butler. He’s a longtime product leader and operator whose career spans companies such as Microsoft, Google, Facebook, and now GitHub, where he works on agentic workflows across the organization. We explore how AI is reshaping the way modern product teams think, collaborate, and ship, and its ripple effects on how we manage process and decision-making. Chris and I chat about the messy realities behind agentic systems, such as why removing too much friction can actually hurt decision quality and why qualitative research matters more now than ever. Chris gives a candid behind-the-scenes look into what’s working, what’s failing, and why experimentation itself may become one of the most important capabilities in the AI era. If you’ve been wondering what testing AI agents actually looks like inside a cutting-edge company, this episode is for you. Takeaways: AI is collapsing traditional product development workflows, but not necessarily eliminating the need for product managers, engineers, or designers. Instead, roles are decomposing into smaller tasks where humans and machines each handle different types of work. Removing all friction from product development can actually reduce decision quality. 
Chris argues that tension between desirability, viability, and feasibility perspectives is still critical because reasoning often happens through human discussion, not just inside individual minds or AI systems. AI-generated “rude feedback” tools can help teams improve ideas faster because people are often more receptive to harsh critique from a machine than from another human. GitHub experimented with sarcastic AI Q&A systems that surfaced weak assumptions and missing details without the reputational risk of peer criticism. The future of AI inside organizations may be less about autonomous agents replacing humans and more about “process as code.” GitHub is experimenting with natural-language policy documents that both humans and agents can...","thumbnail_url":"https://img.transistorcdn.com/hRAQ0Cvexq2Nhl7H1KPLfxWZ14skSKkH4xG8JMRnoOM/rs:fill:0:0:1/w:400/h:400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9zaG93/LzUwMDU0LzE3MDg3/MTI0NTQtYXJ0d29y/ay5qcGc.webp","thumbnail_width":300,"thumbnail_height":300}