Interested in joining our next batch? Applications open soon.

Degree Level Course

Special topics in Machine Learning (Reinforcement Learning)

To enable students to understand the reinforcement learning paradigm, identify when an RL formulation is appropriate, understand the basic solution approaches in RL, and implement and evaluate various RL algorithms.

by Balaraman Ravindran

Course ID: BSCCS4002

Course Credits: 4

Course Type: Elective

Pre-requisites: None

Course structure & Assessments

12 weeks of coursework, weekly online assignments, 2 in-person invigilated quizzes, and 1 in-person invigilated end-term exam. For details of the standard course structure and assessments, visit the Academics page.

WEEK 1 Review of ML fundamentals – Classification, Regression. Review of probability theory and optimization concepts.
WEEK 2 RL Framework; Supervised learning vs. RL; Explore-Exploit Dilemma; Examples.
WEEK 3 Multi-Armed Bandits (MAB): Definition, Uses, Algorithms, Contextual Bandits, Transition to full RL, Intro to the full RL problem.
WEEK 4 Intro to MDPs: Definitions, Returns, Value function, Q-function.
WEEK 5 Bellman Equation, DP, Value Iteration, Policy Iteration, Generalized Policy Iteration.
WEEK 6 Evaluation and Control: TD learning, SARSA, Q-learning, Monte Carlo, TD Lambda, Eligibility Traces.
WEEK 7 Maximization-Bias & Representations: Double Q learning, Tabular learning vs. Parameterized, Q-learning with NNs
WEEK 8 Function approximation: Semi-gradient methods, SGD, DQNs, Replay Buffer.
WEEK 9 Policy Gradients: Introduction, Motivation, REINFORCE, PG theorem, Introduction to AC methods
WEEK 10 Actor-Critic Methods, Baselines, Advantage AC, A3C. Advanced Value-Based Methods: Double DQN, Prioritized Experience Replay, Dueling Architectures, Expected SARSA.
WEEK 11 Advanced PG/A-C methods: Deterministic PG and DDPG, Soft Actor-Critic (SAC). HRL: Introduction to hierarchies, types of optimality, SMDPs, Options, HRL algorithms. POMDPs: Intro, Definitions, Belief states, Solution Methods; History-based methods, LSTMs, Q-MDPs, Direct Solutions, PSR.
WEEK 12 Model-Based RL: Introduction, Motivation, Connections to Planning, Types of MBRL, Benefits, RL with a Learnt Model, Dyna-style models, Latent variable models, Examples, Implicit MBRL. Case study on design of RL solution for real-world problems.

About the Instructors

Balaraman Ravindran
Professor, CSE, IIT Madras

B. Ravindran heads the Robert Bosch Centre for Data Science & Artificial Intelligence (RBCDSAI) at IIT Madras. He is the Mindtree Faculty Fellow, TCS Affiliate Faculty and Professor in the Department of Computer Science and Engineering at IIT Madras. He has held visiting positions at the Indian Institute of Science, University of Technology, Sydney, and Google Research. Currently, his research interests span the areas of geometric deep learning and reinforcement learning. He is one of the founding executive committee members of the India chapter of ACM SIGKDD. He is currently serving on the editorial boards of Machine Learning Journal, JAIR, ACM Transactions on Intelligent Systems and Technology, PLOS One, and Frontiers in Big Data and AI. He has published more than 100 papers in premier journals and conferences. His work with students has won multiple best paper awards, the most recent being the best application paper at PAKDD 2021. His video lectures on NPTEL are widely viewed and have received accolades for their depth and delivery. He received his PhD from the University of Massachusetts, Amherst and his Master's degree from the Indian Institute of Science, Bangalore. He is a senior member of the Association for the Advancement of Artificial Intelligence (AAAI) and an ACM Distinguished Member.