Course launching Spring 2026. This website is under active development, and content and materials are updated continuously.

Course Implementations & Demos

Demo animation will go here
RL FOUNDATIONS

GridWorld Value Iteration

Tabular value iteration and policy visualization (Module 2)

Coming soon
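As a preview of the idea behind this demo, tabular value iteration can be sketched in a few lines. This is a minimal illustration, not the course implementation: the 4x4 grid, the goal in the bottom-right corner, the reward of -1 per step, the undiscounted setting, and all function names are assumptions made for the example.

```python
# Hypothetical 4x4 GridWorld (assumption): goal at bottom-right, reward -1 per step.
N = 4
GOAL = (N - 1, N - 1)
GAMMA = 1.0  # undiscounted, chosen only to keep the values easy to read
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
ARROWS = {"up": "^", "down": "v", "left": "<", "right": ">"}


def step(state, action):
    """Deterministic transition; moving off the grid leaves the agent in place."""
    if state == GOAL:
        return state, 0.0
    dr, dc = ACTIONS[action]
    r, c = state[0] + dr, state[1] + dc
    if not (0 <= r < N and 0 <= c < N):
        r, c = state
    return (r, c), -1.0


def value_iteration(tol=1e-9):
    """Apply the Bellman optimality backup in place until values stop changing."""
    V = {(r, c): 0.0 for r in range(N) for c in range(N)}
    while True:
        delta = 0.0
        for s in V:
            if s == GOAL:
                continue
            best = max(
                reward + GAMMA * V[nxt]
                for nxt, reward in (step(s, a) for a in ACTIONS)
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V


def greedy_policy(V):
    """Pick, in each non-goal state, an action that maximizes the one-step backup."""
    def backup(s, a):
        nxt, reward = step(s, a)
        return reward + GAMMA * V[nxt]
    return {s: max(ACTIONS, key=lambda a: backup(s, a)) for s in V if s != GOAL}


V = value_iteration()
policy = greedy_policy(V)

# Text rendering of the policy: one arrow per cell, G for the goal.
for r in range(N):
    print(" ".join("G" if (r, c) == GOAL else ARROWS[policy[(r, c)]] for c in range(N)))
```

With these assumptions, each state's value is the negative of its Manhattan distance to the goal, which makes the result easy to check by hand.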
Demo animation will go here
DEEP RL

CartPole with DQN

DQN variations using Stable-Baselines3 (Module 3)

Coming soon
Demo animation will go here
POLICY GRADIENTS

PPO on MiniGrid

Proximal Policy Optimization implementation (Module 4)

Coming soon
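The core of Proximal Policy Optimization is its clipped surrogate objective, which can be sketched independently of any environment. The snippet below is a minimal illustration under stated assumptions, not the course implementation: the function name, the plain-list inputs, and the clip range of 0.2 are choices made for the example, and no MiniGrid code is involved.

```python
import math


def ppo_clip_objective(new_logps, old_logps, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a batch (to be maximized).

    new_logps / old_logps: log-probabilities of the taken actions under the
    current policy and the policy that collected the data.
    advantages: advantage estimates for the same actions.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        ratio = math.exp(new_lp - old_lp)  # probability ratio pi_new / pi_old
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Taking the minimum removes the incentive to move the ratio
        # outside [1 - clip_eps, 1 + clip_eps].
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


# Usage with hypothetical numbers: the ratio of 1.5 is clipped to 1.2 because
# the advantage is positive, and the ratio of 0.5 is clipped to 0.8 because
# the advantage is negative.
positive_case = ppo_clip_objective([math.log(1.5)], [0.0], [1.0])
negative_case = ppo_clip_objective([math.log(0.5)], [0.0], [-1.0])
```

In a full training loop this quantity would be computed on tensors and combined with a value loss and, typically, an entropy bonus; those parts are omitted here.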
Demo animation will go here
LANGUAGE MODELS

RLHF for Toy LLM

Preference collection and reward model training (Module 7)

Coming soon
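Reward model training from preference pairs is commonly done with a Bradley-Terry style pairwise loss, which the sketch below illustrates. Everything here is an assumption made for the example rather than the course implementation: the one-parameter linear reward model, the scalar "quality" feature, the three preference pairs, and the learning rate.

```python
import math


def preference_loss(reward_chosen, reward_rejected):
    """Pairwise loss -log(sigmoid(r_chosen - r_rejected)), computed stably."""
    diff = reward_chosen - reward_rejected
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical preference data: (feature of chosen response, feature of rejected response).
pairs = [(1.0, 0.2), (0.8, 0.1), (0.9, 0.5)]


def mean_loss(w):
    """Average pairwise loss for the toy reward model r(x) = w * x."""
    return sum(preference_loss(w * xc, w * xr) for xc, xr in pairs) / len(pairs)


w = 0.0  # single weight of the toy reward model
initial_loss = mean_loss(w)
learning_rate = 0.1
for _ in range(100):
    # d/dw of -log(sigmoid(w * (xc - xr))) is -(1 - sigmoid(w * (xc - xr))) * (xc - xr)
    grad = sum(-(1.0 - sigmoid(w * (xc - xr))) * (xc - xr) for xc, xr in pairs) / len(pairs)
    w -= learning_rate * grad
final_loss = mean_loss(w)
```

Because every chosen response in this toy data has the larger feature value, gradient descent pushes the weight upward and the loss below its starting value of log 2. A real reward model would replace the scalar feature with a language model's representation of each response.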
Demo animation will go here
OFFLINE RL

Decision Transformer

Sequence modeling approach implementation (Module 8)

Coming soon
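The sequence modeling view used by Decision Transformer conditions each action on a return-to-go, the sum of rewards from the current step to the end of the trajectory. The sketch below shows only that preprocessing step, under stated assumptions: the function names, the tuple-based token layout, and the toy trajectory are hypothetical, and no model is trained.

```python
def returns_to_go(rewards):
    """Undiscounted return-to-go for each timestep: sum of rewards from t onward."""
    rtg = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running += rewards[t]
        rtg[t] = running
    return rtg


def to_token_sequence(states, actions, rewards):
    """Interleave (return-to-go, state, action) per timestep into one flat sequence."""
    sequence = []
    for rtg, state, action in zip(returns_to_go(rewards), states, actions):
        sequence.extend([("rtg", rtg), ("state", state), ("action", action)])
    return sequence


# Hypothetical three-step trajectory from an offline dataset.
states = ["s0", "s1", "s2"]
actions = ["a0", "a1", "a2"]
rewards = [1.0, 0.0, 2.0]

tokens = to_token_sequence(states, actions, rewards)
```

At evaluation time the first return-to-go is set to a desired target return and decremented by each observed reward, which is how the same model can be steered toward different levels of performance.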
Demo video will go here
MODEL-BASED RL

DreamerV3 World Model

World model learning and planning (Module 9)

Coming soon
Demo animation will go here
HIERARCHICAL RL

Hierarchical RL Agent

Options framework implementation (Module 10)

Coming soon
Demo animation will go here
MULTI-AGENT RL

PettingZoo Multi-Agent

Cooperative and competitive agent training (Module 12)

Coming soon
Demo video will go here
ROBOTICS

Sim2Real Transfer

PyBullet/MuJoCo simulation to real world (Project 3)

Coming soon