Exploration-focused training using the MaxDiff RL algorithm enables robotics AI to handle new tasks immediately by encouraging robots to be randomly adventurous and experience a wide range of states.
•8m read time• From arstechnica.com
Sort:
Exploration-focused training using the MaxDiff RL algorithm enables robotics AI to handle new tasks immediately by encouraging robots to be randomly adventurous and experience a wide range of states.
Sort: