Safe machine learning for intelligent multi-robot systems
Open Access
- Author:
- Yuan, Zhenyuan
- Graduate Program:
- Electrical Engineering (PHD)
- Degree:
- Doctor of Philosophy
- Document Type:
- Dissertation
- Date of Defense:
- December 01, 2023
- Committee Members:
- Madhavan Swaminathan, Program Head/Chair
- Minghui Zhu, Chair & Dissertation Advisor
- Alan Wagner, Outside Unit & Field Member
- Constantino Lagoa, Major Field Member
- Ying Sun, Major Field Member
- Keywords:
- Machine learning
- Multi-robot systems
- distributed learning
- reinforcement learning
- safe machine learning
- Gaussian process regression
- motion planning
- generalization
- learning and control
- Abstract:
- Recent advances in embedded computing and mobile sensing have led to the pervasive use of robotic systems in both civil and military applications. With single autonomous robots now widely accepted for particular tasks, and with the development of high-speed communication technologies, there are growing efforts to connect robots and make them work collaboratively as a team. A key element that enhances the autonomy and intelligence of these robotic systems is machine learning. However, recent accidents involving machine learning-enabled robots indicate that machine learning remains unsafe. This dissertation is concerned with safe machine learning in intelligent multi-robot systems; that is, developing a set of algorithms that multi-robot systems can utilize to improve system performance while remaining safe. The research agenda is developed along the following lines. The dissertation starts from the fundamental problem of distributed learning with uncertainty quantification in multi-robot systems. In particular, we consider the problem where a group of agents aims to collaboratively learn a common latent function from streaming data. We propose a class of lightweight distributed Gaussian process regression algorithms that explicitly account for the limited memory, computation, and communication budgets of robotic systems. By investigating the transient and steady-state performances of the proposed algorithms, we show that communication brings Pareto improvement to the agents in the network. We next show how to integrate the learning algorithm developed above with motion planning to ensure robot safety during the entire online learning process. In particular, we propose a learning and planning framework that solves safe navigation problems in uncertain environments or under uncertain dynamics, and we derive sufficient conditions that ensure the safety of the system.
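As background for the paragraph above, the sketch below shows the standard (centralized) Gaussian process posterior with an RBF kernel — the basic uncertainty-quantification primitive that distributed GP regression algorithms build on. This is a generic illustration, not the dissertation's distributed algorithm; the kernel choice, `length_scale`, and `noise` values are assumptions for the example.

```python
import numpy as np

def gp_posterior(X_train, y_train, X_test, length_scale=1.0, noise=1e-2):
    """Gaussian process posterior with an RBF kernel.

    Returns the predictive mean and variance at X_test; the variance is
    the uncertainty estimate that safe planners can consume.
    """
    def rbf(A, B):
        # Squared-exponential kernel k(a, b) = exp(-(a - b)^2 / (2 l^2)).
        d2 = (A[:, None] - B[None, :]) ** 2
        return np.exp(-d2 / (2.0 * length_scale ** 2))

    K = rbf(X_train, X_train) + noise * np.eye(len(X_train))
    K_s = rbf(X_test, X_train)
    K_ss = rbf(X_test, X_test)
    # Cholesky-based solve for numerical stability.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s @ alpha
    v = np.linalg.solve(L, K_s.T)
    var = np.diag(K_ss) - np.sum(v ** 2, axis=0)
    return mean, var

# Example: learn a latent function (here sin) from a few noisy samples.
rng = np.random.default_rng(0)
X = np.linspace(0, 2 * np.pi, 8)
y = np.sin(X) + 0.01 * rng.standard_normal(8)
mu, var = gp_posterior(X, y, np.array([np.pi / 2]))
```

A distributed variant would, roughly, have each agent maintain such a posterior over its own streaming data and fuse the agents' predictions over the communication network under memory and bandwidth budgets.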
Then we consider the problem of zero-shot generalization in reinforcement learning. In particular, we consider the problem of multiple learners collaboratively learning a single control policy that performs well in new environments without additional data collection or policy adaptation. We formulate the problem as a federated optimization problem with an unknown objective function, and we propose a class of federated optimization algorithms that enjoy zero-shot generalization guarantees. We further derive theoretical guarantees on almost-sure convergence, almost consensus, Pareto improvement, and global convergence. Finally, we investigate how a robot can quickly adapt its control policy online by incrementally leveraging its previous learning experiences. Specifically, we study online meta reinforcement learning on physical agents. We propose a novel online meta-update method and a policy masking framework. The policy masking framework ensures all-time safety, while the online meta-update method is sample-efficient and achieves sublinear growth of dynamic regret.
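To make the federated-optimization setting concrete, the sketch below shows a generic local-update-then-average round over simple quadratic surrogate losses, one per learner's environment. This is a minimal illustration of the federated structure only, not the dissertation's algorithm or its generalization guarantees; the learning rate, step counts, and toy losses are assumptions for the example.

```python
import numpy as np

def local_update(theta, grad_fn, lr=0.1, steps=5):
    """A few local gradient steps on one learner's own objective."""
    for _ in range(steps):
        theta = theta - lr * grad_fn(theta)
    return theta

def federated_round(thetas, grad_fns, lr=0.1):
    """One round: each learner updates locally, then all average
    their parameters, driving the network toward consensus."""
    updated = [local_update(t, g, lr) for t, g in zip(thetas, grad_fns)]
    avg = np.mean(updated, axis=0)
    return [avg.copy() for _ in updated]

# Toy example: two learners with quadratic losses centered at different
# environment-specific optima (1.0 and 3.0). The consensus parameter
# converges to the average optimum, 2.0.
centers = [np.array([1.0]), np.array([3.0])]
grad_fns = [lambda th, c=c: 2.0 * (th - c) for c in centers]
thetas = [np.zeros(1), np.zeros(1)]
for _ in range(50):
    thetas = federated_round(thetas, grad_fns)
```

In the reinforcement-learning setting described above, the quadratic losses would be replaced by each learner's (unknown) expected return in its own environment, estimated from rollouts.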