My posts on reinforcement learning (RL) as I self-study. Each post breaks down a core concept with a mix of theory and implementation. My motivation is to drastically shorten the learning curve for people who want to learn these concepts. There are a lot of amazing learning resources out there; what differentiates these posts is a focus on fundamental RL concepts, each taken end to end from concept to functional implementation.

No-bullshit understanding to...

Policy Iteration

Monte Carlo Control with Blackjack (WIP)

Temporal Difference Learning with Blackjack (WIP)

Helpful links for learning RL

https://www.youtube.com/watch?v=2pWv7GOvuf0&list=PLqYmG7hTraZDM-OYHWgPebj2MfCFzFObQ&index=1

Currently trying to understand the following papers: