A step-by-step introduction to Reinforcement Learning using Q-Learning, implemented in C# with the Unity game engine. Covers core RL concepts including policies, the Bellman Equation, Q-values, temporal difference learning, and the exploration-exploitation trade-off with ε-greedy strategies. Builds a navigating robot example on a 2D grid, with full code samples and a GitHub repository. Concludes with a brief overview of the broader RL algorithm landscape including DQN.
Table of contents
What is Reinforcement Learning
Navigating Robot Example
The Bellman Equation
Solving the Bellman Equation Iteratively
Action Quality (Q-Values)
Exploration vs. Exploitation (ε-Greedy)
The Broader RL Ecosystem