{"type":"rich","version":"1.0","provider_name":"Transistor","provider_url":"https://transistor.fm","author_name":"Pretrained","title":"Why Your Agent is Cheating","html":"<iframe width=\"100%\" height=\"180\" frameborder=\"no\" scrolling=\"no\" seamless src=\"https://share.transistor.fm/e/f3691b23\"></iframe>","width":"100%","height":180,"duration":3659,"description":"Pierce and Richard are back for the second listener mailbag. They break down what reward hacking really is and why models so often learn the wrong lesson, explain practical fine-tuning (from pre-training to prompting), unpack why LLMs use tokens instead of words, examine context length as a hardware versus mathematical limitation, and much more.","thumbnail_url":"https://img.transistorcdn.com/8veBHYJ1tFjtWlv9ET3YGaLijqZK5MYE6tVoUbwgKaw/rs:fill:0:0:1/w:400/h:400/q:60/mb:500000/aHR0cHM6Ly9pbWct/dXBsb2FkLXByb2R1/Y3Rpb24udHJhbnNp/c3Rvci5mbS8yYTRi/YTQ5Zjk4ZjIzMmU2/YzRiMWZjN2E5ZmJk/NzNjNi5wbmc.webp","thumbnail_width":300,"thumbnail_height":300}