Stochastic Accelerated Primal Dual Methods for Minimax Problems

Open Access
- Author:
- Zhang, Xuan
- Graduate Program:
- Industrial Engineering
- Degree:
- Doctor of Philosophy
- Document Type:
- Dissertation
- Date of Defense:
- September 12, 2024
- Committee Members:
- Steven Landry, Program Head/Chair
Necdet Aybat, Chair & Dissertation Advisor
Uday Shanbhag, Major Field Member
Mehrdad Mahdavi, Outside Field Member
Constantino Lagoa, Outside Unit & Minor Member
- Keywords:
- Minimax Optimization
Nonconvex Optimization
First Order Method
Stochastic Optimization
- Abstract:
- Nowadays, an increasing number of applications arising in machine learning, statistics, finance, and game theory require solving optimization problems with complicated structures, e.g., problems with functional constraints or robust optimization problems, which can be formulated in the minimax form. Important examples include empirical risk minimization, adversarial deep learning, distributionally robust learning, and generative adversarial networks. There is therefore a pressing need for more powerful iterative optimization tools that can handle these minimax problems with efficient computations in each iteration.

In this dissertation, for both nonconvex and convex minimax problems, we propose a class of stochastic first-order primal-dual (FOPD) algorithms with a novel momentum term, in which each iteration takes a primal proximal-stochastic gradient step followed by a dual proximal-optimistic stochastic gradient step. The proposed algorithms achieve the best complexity bounds among existing works in the related literature. Furthermore, we study the fundamental relations between convergence rate and robustness as functions of the algorithmic parameters, i.e., the primal and dual step sizes and the momentum parameter. Additionally, we incorporate the SPIDER variance-reduction technique into the proposed FOPD framework and analyze the resulting method.

The remainder of the dissertation focuses on developing stochastic decentralized FOPD algorithms. Motivated by the need for efficient scaling, enhanced data privacy, improved system robustness, optimized resource use, and democratized AI development, we propose a stochastic distributed FOPD algorithm with variance reduction that achieves the optimal communication and oracle complexity bounds. To the best of our knowledge, this is the first distributed method that uses a single communication round per iteration to jointly optimize the oracle and communication complexities.
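For context, minimax problems of the kind described in the abstract are commonly written as a composite stochastic saddle-point problem. The notation below is an illustrative sketch of that generic form, not the dissertation's exact formulation; the structural assumptions (convexity/concavity, smoothness of the coupling term) vary by setting.

```latex
% Generic composite stochastic minimax problem (illustrative notation):
% g and h are proper, closed, convex regularizers handled via proximal maps,
% and \Phi is the smooth coupling term accessed through stochastic gradients.
\min_{x \in \mathbb{R}^n} \; \max_{y \in \mathbb{R}^m} \;
  \mathcal{L}(x, y) \;=\; g(x) \;+\; \mathbb{E}_{\xi}\!\left[\Phi(x, y; \xi)\right] \;-\; h(y)
```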
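The per-iteration structure described in the abstract (a primal proximal-stochastic gradient step followed by a dual proximal-optimistic stochastic gradient step) can be sketched as follows. This is a minimal illustration under assumed interfaces, not the dissertation's exact algorithm: the names grad_x, grad_y, prox_g, prox_h, the step sizes tau and sigma, and the momentum parameter theta are all placeholders.

```python
import numpy as np

def spd_step(x, y, grad_y_prev, grad_x, grad_y, prox_g, prox_h,
             tau=0.1, sigma=0.1, theta=1.0):
    """One stochastic primal-dual iteration with an optimistic dual step.

    grad_x(x, y) and grad_y(x, y) return stochastic gradient estimates;
    prox_g and prox_h are proximal maps of the primal/dual regularizers.
    """
    # Primal: proximal-stochastic gradient descent step in x.
    x_new = prox_g(x - tau * grad_x(x, y), tau)

    # Dual: proximal-optimistic stochastic gradient ascent step in y,
    # extrapolating with momentum theta using the previous dual gradient.
    g_y = grad_y(x_new, y)
    y_new = prox_h(y + sigma * ((1 + theta) * g_y - theta * grad_y_prev), sigma)
    return x_new, y_new, g_y

# Toy usage on a bilinear coupling L(x, y) = y * (a @ x) with gradient noise;
# prox_id models the unregularized case (g = h = 0).
rng = np.random.default_rng(0)
a = rng.standard_normal(5)
gx = lambda x, y: y * a + 0.01 * rng.standard_normal(5)  # noisy grad wrt x
gy = lambda x, y: a @ x + 0.01 * rng.standard_normal()   # noisy grad wrt y
prox_id = lambda z, step: z
x, y, g_prev = np.zeros(5), 0.0, 0.0
for _ in range(100):
    x, y, g_prev = spd_step(x, y, g_prev, gx, gy, prox_id, prox_id)
```

With theta = 1 the dual update uses the familiar optimistic correction 2*g_y - g_prev; in a full method the gradient oracles would return minibatch estimates whose variance drives the complexity bounds.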
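The SPIDER variance-reduction technique mentioned in the abstract replaces plain minibatch gradients with a recursively corrected estimator. The sketch below shows the standard SPIDER-style recursion in illustrative form; the interface (grad, batch, full_batch) is a placeholder, not the dissertation's implementation.

```python
def spider_estimator(grad, x_curr, x_prev, v_prev, batch, full_batch=None):
    """SPIDER-style recursive variance-reduced gradient estimate (sketch).

    grad(x, batch) returns an averaged minibatch gradient; the caller
    periodically restarts the recursion by passing a full (or large) batch.
    """
    if full_batch is not None:
        # Restart: (near-)full gradient evaluation anchors the estimator.
        return grad(x_curr, full_batch)
    # Recursive correction: evaluate the same minibatch at the current and
    # previous iterates so the estimator's variance shrinks as iterates
    # move less.
    return v_prev + grad(x_curr, batch) - grad(x_prev, batch)
```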