A deep dive into the Hamilton-Jacobi-Bellman (HJB) equation, tracing its roots from Bellman's 1952 dynamic programming work through continuous-time reinforcement learning to modern diffusion models. Covers the derivation of the HJB equation for both deterministic and stochastic controlled diffusions, and introduces the continuous-time reinforcement learning framework.
Table of contents
1. Introduction
2. Continuous-time Reinforcement Learning
3. Diffusion Models
References
Appendix A: LQR Derivation
Appendix B: Merton Derivation
Appendix C: Non-autonomous and Finite-Horizon Cases