{"type":"rich","version":"1.0","provider_name":"Transistor","provider_url":"https://transistor.fm","author_name":"Smooth Scaling: System Design for High Traffic","title":"Handling 200k Requests Per Second Surges with Zalando SRE Manager Johannes Boumans","html":"<iframe width=\"100%\" height=\"180\" frameborder=\"no\" scrolling=\"no\" seamless src=\"https://share.transistor.fm/e/c7575740\"></iframe>","width":"100%","height":180,"duration":2586,"description":"In this episode, Johannes Boumans, Engineering Manager in Zalando’s SRE team, shares how Lounge by Zalando handles daily surges of up to 200,000 requests per second. He discusses the shift from monoliths to microservices, the “you build it, you run it” model, SRE champions, and the trade-offs behind reliability, fairness, and cost. From bot defense to chaos engineering, it’s a deep dive into scaling one of Europe’s largest e-commerce platforms. Johannes Boumans is an Engineering Manager in the SRE organization at Zalando, where he leads reliability efforts for Zalando Lounge, the company’s off-price shopping destination. Over nearly 10 years at Zalando, Johannes has grown from product support into SRE leadership, where he now supports 25 engineering teams in building resilient, fair, and scalable systems. 
Johannes is passionate about the “you build it, you run it” philosophy and champions practices like chaos engineering, predictive scaling, and bot defense to keep systems reliable. This podcast is hosted by José Quaresma, researched by Joseph Thwaites and produced by Perseu Mandillo. 00:00 – Intro 01:28 – Zalando: Europe's leading fashion destination 02:42 – The company’s rapid tech evolution since 2008 03:41 – From one team to 25: Johannes’ journey 05:48 – How the SRE champions model works 08:00 – What reliability really means at Zalando 09:27 – From monolith to full DevOps accountability 11:32 – What makes Lounge by Zalando unique 12:50 – Dealing with massive daily traffic spikes 14:05 – Predictive scaling and real-time cost control 17:15 – First-come, first-served: fairness at scale 22:11 – Solving the challenges of limited inventory 25:09 – Combating bots with layered protections 27:12 – Trade-offs: performance vs. experience 29:38 – Why Lounge doesn’t have a search function 31:17 – Advice for engineering managers facing traffic surges 34:25 – Chaos testing in production—including turning off zones 35:53 – Scaling advice for daily vs. seasonal peaks 37:55 –...","thumbnail_url":"https://img.transistorcdn.com/NUjWPZClJT8sBh8TJHB749aD71YNKszNFP0UX56QWiY/rs:fill:0:0:1/w:400/h:400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS9hY2Jj/MTdjYjIxN2VmOThi/N2NiNmM5NmI4OWEy/ZGFiNi5qcGc.webp","thumbnail_width":300,"thumbnail_height":300}