I have been teaching a course on reinforcement learning for five years. Along with my teaching, I developed a textbook and an open course entitled "Mathematical Foundations of Reinforcement Learning".
The PDF of the book as well as lecture notes are available on the GitHub homepage. The textbook has received 10,000+ stars on GitHub!
Novelty: This book provides a mathematical but friendly introduction to reinforcement learning. Its content structure is also novel.
I developed both Chinese and English lecture videos. You can check our Bilibili channel or our YouTube channel.
The lecture videos have received 1,600,000+ views online and very good feedback!
Below are the links to my English lecture videos.
L1: Basic Concepts (P2-Reward, return, Markov decision process)
L4: Value Iteration and Policy Iteration (P1-Value iteration)
L4: Value Iteration and Policy Iteration (P2-Policy iteration)
L4: Value Iteration and Policy Iteration (P3-Truncated policy iteration)
L5: Monte Carlo Learning (P5-MC Epsilon-Greedy: introduction)
L6: Stochastic Approximation and SGD (P1-Motivating example)
L6: Stochastic Approximation and SGD (P2-RM algorithm: introduction)
L6: Stochastic Approximation and SGD (P3-RM algorithm: convergence)
L6: Stochastic Approximation and SGD (P4-SGD algorithm: introduction)
L6: Stochastic Approximation and SGD (P5-SGD algorithm: examples)
L6: Stochastic Approximation and SGD (P6-SGD algorithm: properties)
L6: Stochastic Approximation and SGD (P7-SGD algorithm: comparison)
L7: Temporal-Difference Learning (P2-TD algorithm: introduction)
L7: Temporal-Difference Learning (P3-TD algorithm: convergence)
L7: Temporal-Difference Learning (P5-Expected Sarsa & n-step Sarsa)
L7: Temporal-Difference Learning (P6-Q-learning: introduction)
L7: Temporal-Difference Learning (P7-Q-learning: pseudo code)
L7: Temporal-Difference Learning (P8-Unified viewpoint and summary)
L8: Value Function Approximation (P1-Motivating example – curve fitting)
L8: Value Function Approximation (P3-Optimization algorithm)
L8: Value Function Approximation (P4-Illustrative examples and analysis)
L8: Value Function Approximation (P7-DQN – experience replay)
L8: Value Function Approximation (P8-DQN – implementation and example)
The slides for the above lecture videos can be found here: https://github.com/MathFoundationRL/Book-Mathematical-Foundation-of-Reinforcement-Learning/tree/main/Lecture%20slides