ML Safety Report

In this week’s ML & AI Safety Update, we hear Paul Christiano’s take on one of OpenAI’s main alignment strategies, dive into the second-round winners of the Inverse Scaling Prize, and share the many fascinating projects from our mechanistic interpretability hackathon. And stay tuned until the end for some unique opportunities in AI safety!

Show Notes

Opportunities (https://ais.pub/aistraining)
Sources

What is ML Safety Report?

A weekly podcast updating you on the latest research in AI and machine learning safety from organizations such as DeepMind, Anthropic, and MIRI.