Designing Efficient and Robust Deep Reinforcement Learning Algorithms
Speaker: Longbo Huang – Beijing, China
Topic(s): Artificial Intelligence, Machine Learning, Computer Vision, Natural Language Processing
Abstract
Deep reinforcement learning (DRL) has received much attention and has found successful applications in various important fields, including games, robotics, transportation and science. Despite its continuing success, DRL still faces several major challenges, including accurate value function estimation, sample efficiency and efficient practical implementation. In this talk, we will present our recent results on tackling these issues in DRL. (i) Using the Boltzmann softmax operator to improve value function estimation in single-agent DRL. We show that properly incorporating the softmax operator in continuous control helps smooth the optimization landscape and leads to efficient policy search and optimization. We then present the Softmax Deep Double Deterministic Policy Gradient (SD3) algorithm, which effectively reduces both overestimation and underestimation bias and outperforms state-of-the-art methods. (ii) Using regularization and softmax for efficient policy search in multi-agent RL (MARL). We first identify a gradient explosion issue in existing methods, which severely affects value function estimation. We then propose RES, a novel softmax and regularization-based update scheme that penalizes large joint action-values deviating from a baseline, and demonstrate its effectiveness in policy learning. (iii) Applying DRL to sustainable computing applications. We develop highly scalable and efficient DRL algorithms for large-scale dockless bike sharing and network optimization problems, which significantly outperform state-of-the-art methods.
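As a rough illustration of the Boltzmann softmax operator mentioned in part (i), the sketch below computes a softmax-weighted average of a finite list of action values. It is not the SD3 algorithm from the talk: the function name `boltzmann_softmax`, the inverse-temperature parameter `beta`, and the use of a small discrete list of values are all assumptions made for illustration.

```python
import math

def boltzmann_softmax(q_values, beta):
    """Softmax-weighted average of q_values (hypothetical helper).

    Computes sum_i w_i * q_i with w_i proportional to exp(beta * q_i).
    beta = 0 gives the arithmetic mean; large beta approaches max(q_values).
    """
    # Subtract the maximum before exponentiating for numerical stability;
    # this does not change the normalized weights.
    q_max = max(q_values)
    weights = [math.exp(beta * (q - q_max)) for q in q_values]
    total = sum(weights)
    return sum(w * q for w, q in zip(weights, q_values)) / total

# Illustrative action values (made up for this sketch).
q = [1.0, 2.0, 3.0]
mean_like = boltzmann_softmax(q, beta=0.0)   # behaves like the mean
mid_value = boltzmann_softmax(q, beta=1.0)   # between mean and max
max_like = boltzmann_softmax(q, beta=50.0)   # behaves like the max
```

The operator interpolates between the mean and the hard max of the values, which is one way to see why it can give a softer value target than a plain max.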
About this Lecture
Number of Slides: 50
Duration: 45 minutes
Languages Available: English
Request this Lecture
To request this particular lecture, please complete this online form.
Request a Tour
To request a tour with this speaker, please complete this online form.
All requests will be sent to ACM headquarters for review.