Tic-Tac-Toe the Hard Way

Switching gears, we focus on how Yannick’s been training his model using reinforcement learning. He explains the differences from David’s supervised learning approach. We find out how his system performs against a player that makes random tic-tac-toe moves.

Show Notes

Resources:
Deep Learning for JavaScript book
Playing Atari with Deep Reinforcement Learning
Two Minute Papers episode on Atari DQN

For more information about the show, check out pair.withgoogle.com/thehardway/.

You can reach out to the hosts on Twitter: @dweinberger and @tafsiri

What is Tic-Tac-Toe the Hard Way?

A writer and a software engineer from Google's People + AI Research team explore the human choices that shape machine learning systems by building competing tic-tac-toe agents.