Fundamental problems in graph learning: optimization, generalization, privacy, and model design

Open Access
- Author:
- Cong, Weilin
- Graduate Program:
- Computer Science and Engineering
- Degree:
- Doctor of Philosophy
- Document Type:
- Dissertation
- Date of Defense:
- November 29, 2023
- Committee Members:
- Chitaranjan Das, Program Head/Chair
- Kamesh Madduri, Major Field Member
- Mahmut Kandemir, Major Field Member
- Suhang Wang, Outside Unit & Field Member
- Mehrdad Mahdavi, Chair & Dissertation Advisor
- Keywords:
- Graph representation learning
- Optimization
- Generalization
- Abstract:
This dissertation investigates several fundamental aspects of Graph Neural Networks (GNNs) in the context of graph representation learning, a field that has made significant strides in practical applications of graph data and has attracted substantial interest in the machine learning community.

**Optimization:** We study how to train GNN models efficiently. We propose neighbor-sampling and variance-reduction strategies that tackle the computational overhead of GNN training by significantly reducing the number of nodes each training step must touch. We also study distributed learning for GNNs, which enables the cooperative training of a single model across multiple machines while minimizing communication overhead.

**Generalization:** We study the performance degradation of deep GNN models, which is often attributed to over-smoothing. Contrary to common belief, our study reveals that over-smoothing does not necessarily occur in practice and that properly trained deeper models can achieve high training accuracy. However, these deeper models often generalize poorly at test time. By scrutinizing the generalization capabilities of GNNs, we show that the very strategies used to reach high training accuracy can significantly impair generalization. This insight offers a fresh perspective on the performance degradation issue in deep GNNs.

**Privacy:** As privacy protection gains prominence, so does the need to unlearn the effect of a specific node from a pre-trained graph learning model. Because of the dependencies between nodes in graph-structured data, representation unlearning in GNNs presents substantial challenges and remains under-explored. To bridge this gap, we propose graph unlearning methods that effectively mitigate node-dependency issues and ensure, with theoretical guarantees, that the unlearned model parameters contain no information about the unlearned node features.

**Model design:** We explore neural architecture design for temporal graph learning, with applications in areas such as user-product and user-ad recommender systems. We aim to establish a neural architecture that captures temporal evolutionary patterns and accurately predicts node properties and future links.
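To make the neighbor-sampling idea in the optimization paragraph concrete, below is a minimal NumPy sketch of fan-out-limited neighbor sampling for minibatch GNN training. It is an illustration of the general technique, not the dissertation's specific estimator: the function names (`sample_blocks`, `forward`), the toy cycle graph, and the fixed fan-outs are all assumptions for the example, and the variance-reduction component is only noted in a comment.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_blocks(adj, seeds, fanouts):
    """Expand seeds outward, sampling at most `fanout` neighbors per node per layer.

    Returns one "block" per layer: a dict mapping each destination node to its
    sampled source neighbors. Uniform sampling keeps the computation graph small
    instead of letting the multi-hop neighborhood grow exponentially with depth;
    variance-reduction schemes additionally debias the resulting estimate, e.g.
    by reusing historical embeddings or importance weights.
    """
    blocks, frontier = [], list(seeds)
    for fanout in fanouts:
        block = {}
        for v in frontier:
            nbrs = adj[v]
            k = min(fanout, len(nbrs))
            block[v] = list(rng.choice(nbrs, size=k, replace=False)) if k else []
        blocks.append(block)
        frontier = sorted({u for nb in block.values() for u in nb} | set(frontier))
    return blocks

def forward(blocks, feats, seeds):
    """Mean-aggregate inward over the sampled blocks, outermost layer first."""
    h = feats.copy()
    for block in reversed(blocks):
        new_h = h.copy()
        for v, nbrs in block.items():
            if nbrs:
                new_h[v] = 0.5 * (h[v] + h[nbrs].mean(axis=0))
        h = new_h
    return h[list(seeds)]

# Toy graph: a 6-cycle with 4-dimensional node features.
adj = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
feats = rng.normal(size=(6, 4))
blocks = sample_blocks(adj, seeds=[0, 3], fanouts=[2, 2])
print(forward(blocks, feats, seeds=[0, 3]))
```

The point of the sketch is the cost profile: with fan-outs of 2 over 2 layers, each seed touches at most 4 sampled nodes per step regardless of how dense the full graph is, which is the computational saving the optimization contributions build on.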
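The privacy paragraph turns on removing a node's influence from already-trained parameters. As a hedged illustration of the flavor of guarantee involved, the sketch below unlearns a node from a linear least-squares model by projecting the weights onto the span of the retained nodes' features, so the result is expressible from retained data alone. This is one projection-style idea for the linear case, written under our own assumptions (`unlearn_by_projection` and the toy dimensions are invented for the example); the dissertation's methods and guarantees for GNNs are more involved.

```python
import numpy as np

rng = np.random.default_rng(1)

def unlearn_by_projection(w, X_keep):
    """Project model weights onto span(rows of X_keep).

    For a linear model, any weight component orthogonal to the retained
    nodes' feature span can only encode information about deleted nodes;
    projecting it away leaves parameters that could have been produced
    from the retained features alone.
    """
    U, s, _ = np.linalg.svd(X_keep.T, full_matrices=False)
    basis = U[:, s > 1e-10]          # orthonormal basis of the retained span
    return basis @ (basis.T @ w)

# Toy setup: 5 nodes with 8-dim features, linear regression weights.
X = rng.normal(size=(5, 8))
y = rng.normal(size=5)
w = np.linalg.lstsq(X, y, rcond=None)[0]   # minimum-norm least-squares fit

X_keep = np.delete(X, 2, axis=0)           # unlearn node 2
w_unlearned = unlearn_by_projection(w, X_keep)

# The unlearned weights lie entirely in span(X_keep): re-projecting is a no-op.
assert np.allclose(unlearn_by_projection(w_unlearned, X_keep), w_unlearned)
print(w, w_unlearned, sep="\n")
```

In a GNN, the deleted node's features also propagate into its neighbors' representations through message passing, which is exactly the node-dependency difficulty the abstract highlights and which a per-row projection like this does not by itself address.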
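For the model-design paragraph, a common ingredient in temporal graph architectures is a time encoding that maps the gap since an interaction to a feature vector, so the model can weigh recent and stale edges differently when predicting future links. Below is a minimal sketch assuming fixed-frequency cosine features, a standard choice in the temporal graph literature; the encoding dimension, base, and the toy `score_link` function are illustrative assumptions, not the dissertation's architecture.

```python
import numpy as np

def time_encoding(delta_t, dim=8, base=10.0):
    """Map time gaps to `dim` features via fixed-frequency cosines.

    Frequencies decay geometrically so that both recent and distant
    interactions stay distinguishable; the encoding is deterministic,
    so it needs no trained parameters.
    """
    freqs = base ** (-np.arange(dim) / dim)   # geometric frequency ladder
    return np.cos(np.outer(np.atleast_1d(delta_t), freqs))

def score_link(hu, hv, recent_gaps):
    """Toy link score: node-state affinity plus the mean recent-gap encoding."""
    t_feat = time_encoding(recent_gaps).mean(axis=0)
    return float(hu @ hv + hu[: t_feat.size] @ t_feat)

rng = np.random.default_rng(2)
hu, hv = rng.normal(size=16), rng.normal(size=16)
print(score_link(hu, hv, recent_gaps=[0.5, 3.0, 40.0]))
```

A real temporal model would feed such encodings, together with neighbor features, into a learned module that summarizes each node's recent interaction history; the sketch only shows how raw timestamps become features the rest of the architecture can consume.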