A groundbreaking approach called in-context exploration-exploitation (ICEE) is presented in this post, which allows for real-time adaptation to users' preferences through neural network inference. ICEE is able to perform Bayesian belief updates without the need for explicit Bayesian inference and can solve new RL tasks within a few episodes.

6m read time From research.atspotify.com
Post cover image
Table of contents
Return conditioned In-context Policy LearningExperimentsConclusions

Sort: