Into AI Safety

Join our hackathon group for the second episode in the Evals November 2023 Hackathon subseries. In this episode, we solidify our goals for the hackathon after some preliminary experimentation and ideation.

Check out Stellaric's website, or follow them on Twitter.

01:53 - Meeting starts
05:05 - Pitch: extension of locked models
23:23 - Pitch: retroactive holdout datasets
34:04 - Preliminary results
37:44 - Next steps
42:55 - Recap

Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.

Evalugator library
Password Locked Model blogpost
TruthfulQA: Measuring How Models Mimic Human Falsehoods
BLEU: a Method for Automatic Evaluation of Machine Translation
BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions
Detecting Pretraining Data from Large Language Models

Creators & Guests

Host
Jacob Haimes
Host of the podcast and all-around great dude.
Editor
Chase Precopia

What is Into AI Safety?

The Into AI Safety podcast aims to make it easier for everyone, regardless of background, to get meaningfully involved in the conversations surrounding the rules and regulations that should govern the research, development, deployment, and use of the technologies encompassed by the term "artificial intelligence" or "AI".

For better-formatted show notes, additional resources, and more, go to https://into-ai-safety.github.io
For even more content and community engagement, head over to my Patreon at https://www.patreon.com/IntoAISafety