Researchers from Peking University and King’s College London developed a decentralized policy optimization framework for multi-agent systems. By reducing communication overhead and system complexity, it improves scalability and decision-making efficiency in large-scale AI systems. The framework uses model learning to improve policy optimization when data is limited, and it employs localized models to predict states and rewards accurately. Tested in diverse scenarios such as transportation and power systems, it outperformed the baselines it was compared against, significantly reducing communication costs while improving convergence and sample efficiency. This scalable multi-agent reinforcement learning (MARL) framework shows potential for applications in advanced traffic, energy, and pandemic management.

From marktechpost.com
