Topics on Nonconvex Learning

Open Access
- Author:
- Liu, Bingyuan
- Graduate Program:
- Statistics
- Degree:
- Doctor of Philosophy
- Document Type:
- Dissertation
- Date of Defense:
- February 23, 2021
- Committee Members:
- Lingzhou Xue, Dissertation Advisor/Co-Advisor
Runze Li, Committee Member
Bing Li, Committee Member
Yanyuan Ma, Committee Member
Zihan Zhou, Outside Member
Ephraim Mont Hanks, Program Head/Chair
Lingzhou Xue, Committee Chair/Co-Chair
- Keywords:
- Nonconvex optimization
machine learning
statistical learning
deep learning
- Abstract:
- Many machine learning models require solving nonconvex and nonsmooth optimization problems. Compared with convex formulations, nonconvex optimization captures the intrinsic structure of the learning problem more accurately, but well-developed algorithms with convergence guarantees for solving nonconvex and nonsmooth problems are usually lacking. This thesis investigates how to design efficient algorithms with convergence guarantees and establish statistical properties of the computed solutions for these nonconvex learning problems. In the first part of the thesis, we study three nonconvex high-dimensional statistical learning problems. In Chapter 3, we propose a robust high-dimensional regression estimator with coefficient thresholding. The thresholding is imposed in the loss function to handle strong dependence among predictors, but it renders the loss nonconvex. We propose an efficient composite gradient descent algorithm that solves the resulting optimization problem with a convergence guarantee, and we prove the estimation consistency of the proposed estimator. In Chapter 4, we propose sparse estimation of semiparametric covariate-adjusted graphical models. In Chapter 5, we study sparse sufficient dimension reduction estimators. For both chapters, we establish the theoretical properties of the nonconvex penalized estimators and propose nonconvex ADMM algorithms that solve them efficiently with computational guarantees. In the second part of the thesis, we study nonconvex neural network models. First, we study the loss landscape of the attention mechanism, a widely used module in deep learning. We show, both theoretically and empirically, that neural network models with attention mechanisms enjoy lower sample complexity and better generalization while maintaining a well-behaved loss landscape. Second, we propose a novel neural network layer that improves model robustness against adversarial attacks through neighborhood preservation. We show that, despite its highly nonconvex nature, the layer has a smaller Lipschitz bound and is therefore more robust to adversarial attacks.
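
To make the composite (proximal) gradient template referenced in Chapter 3 concrete, the sketch below minimizes a smooth, possibly nonconvex loss plus an ℓ1 penalty by alternating a gradient step with a soft-thresholding step. It is a minimal illustration under stated assumptions: the toy least-squares loss, step size, and penalty level are chosen for the example and are not the coefficient-thresholded loss or tuning used in the thesis.

```python
# Minimal composite (proximal) gradient sketch: minimize f(beta) + lam * ||beta||_1,
# where f is smooth but possibly nonconvex. Illustrative assumptions only.
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1 (elementwise soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def composite_gradient_descent(grad_f, beta0, lam, step=0.01, n_iter=500):
    """Iterate beta <- prox_{step*lam*||.||_1}(beta - step * grad_f(beta))."""
    beta = beta0.copy()
    for _ in range(n_iter):
        beta = soft_threshold(beta - step * grad_f(beta), step * lam)
    return beta

# Toy usage: sparse least squares. The smooth part is convex here, but the same
# update applies when f is nonconvex (e.g., a thresholded or robust loss).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))
beta_true = np.zeros(20)
beta_true[:3] = [2.0, -1.5, 1.0]
y = X @ beta_true + 0.1 * rng.normal(size=100)
grad_f = lambda b: X.T @ (X @ b - y) / len(y)   # gradient of (1/2n)||y - X b||^2
beta_hat = composite_gradient_descent(grad_f, np.zeros(20), lam=0.1)
```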
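
The second part studies the loss landscape of attention mechanisms; for reference, the sketch below implements standard single-head scaled dot-product attention, softmax(QK^T/√d)V, in plain NumPy. The dimensions and random inputs are illustrative assumptions, not the architectures analyzed in the thesis.

```python
# Minimal single-head scaled dot-product attention in NumPy (reference sketch).
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Q: (n, d), K: (m, d), V: (m, d_v) -> output of shape (n, d_v)."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)        # scaled pairwise query-key similarities
    weights = softmax(scores, axis=-1)   # each query attends over all keys
    return weights @ V

# Toy usage with random inputs.
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(4, 8)), rng.normal(size=(6, 8)), rng.normal(size=(6, 5))
out = attention(Q, K, V)                 # shape (4, 5)
```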