The Good, the Bad and the Ugly: Exploring the Robustness and Applicability of Adversarial Machine Learning
Open Access
- Author:
- Li, Xiaoting
- Graduate Program:
- Information Sciences and Technology
- Degree:
- Doctor of Philosophy
- Document Type:
- Dissertation
- Date of Defense:
- March 24, 2022
- Committee Members:
- Ting Wang, Major Field Member
C. Lee Giles, Major Field Member
Dinghao Wu, Chair & Dissertation Advisor
Jinchao Xu, Outside Unit & Field Member
Mary Beth Rosson, Program Head/Chair
- Keywords:
- Adversarial machine learning
Attribute privacy
Social good
Defense
Adversarial attacks
- Abstract:
- Neural networks have been widely adopted to address different real-world problems. Despite their remarkable achievements in machine learning tasks, they remain vulnerable to adversarial examples that are imperceptible to humans but can mislead state-of-the-art models. Moreover, such adversarial examples generalize to a variety of common data structures, including images, texts, and networked data. Faced with the significant threat that adversarial attacks pose to security-critical applications, in this thesis we explore the good, the bad, and the ugly of adversarial machine learning. In particular, we focus on investigating the applicability of adversarial attacks to real-world scenarios for social good, as well as their defensive paradigms.

The rapid progress of adversarial attack techniques helps us better understand the underlying vulnerabilities of neural networks, which in turn inspires us to explore their potential use for good purposes. In the real world, social media has profoundly reshaped our daily lives thanks to its worldwide accessibility, but its data privacy also suffers from inference attacks. Building on the fact that deep neural networks are vulnerable to adversarial examples, we take a novel perspective on protecting data privacy in social media and design a defense framework called Adv4SG, in which we introduce adversarial attacks to forge latent feature representations and mislead attribute inference attacks. Since text data in social media carries the most significant private information about users, we investigate how text-space adversarial attacks can be leveraged to protect users' attributes. Specifically, we integrate social media properties to advance Adv4SG, and introduce cost-effective mechanisms to expedite attribute protection over text data under the black-box setting. Through extensive experiments on real-world social media datasets, we show that Adv4SG is an appealing method for mitigating inference attacks.

Second, we extend our study to more complex networked data. A social network is a more heterogeneous environment that is naturally represented as graph-structured data, maintaining rich user activities and complicated relationships among users. This enables attackers to deploy graph neural networks (GNNs) to automate attribute inference from user features and relationships, which makes such privacy disclosure hard to avoid. To address this, we take advantage of the vulnerability of GNNs to adversarial attacks and propose a new graph poisoning attack, called AttrOBF, that misleads GNNs into misclassification and thus protects personal attribute privacy against GNN-based inference attacks on social networks. AttrOBF provides a more practical formulation by obfuscating the attribute values of optimally selected training users in real-world social graphs. Our results demonstrate the promising potential of applying adversarial attacks to attribute protection on social graphs.

Third, we introduce a watermarking-based defense strategy against adversarial attacks on deep neural networks. In the ever-escalating arms race between defenses and attacks, most existing defense methods ignore the fact that attackers can detect and reproduce the differentiable model, which leaves a window for evolving attacks to adaptively evade the defense. Based on this observation, we propose a defense mechanism that creates a knowledge gap between attackers and defenders by embedding a secret watermarking process into standard deep neural networks.
We analyze the experimental results of a wide range of watermarking algorithms in our defense method against state-of-the-art attacks on baseline image datasets, and validate the effectiveness of our method in defending against adversarial examples.

Our research expands the investigation of enhancing deep learning model robustness against adversarial attacks and unveils insights into applying adversarial techniques for social good. We design Adv4SG and AttrOBF to leverage the strengths of adversarial attack techniques to protect social media users' privacy over discrete textual data and networked data, respectively. Both can be realized under the practical black-box setting. We also provide the first attempt at utilizing digital watermarking to increase a model's randomness and thereby suppress attackers' capabilities. Through our evaluation, we validate their effectiveness and demonstrate their promising value in real-world use.
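To make the text-space obfuscation idea behind Adv4SG concrete, the sketch below is a minimal, hypothetical illustration rather than the dissertation's implementation: a toy attribute-inference classifier (TF-IDF plus logistic regression standing in for a neural model) is queried in a black-box fashion, and a greedy word-substitution loop perturbs a post until the inferred attribute flips. The corpus, synonym table, and `obfuscate` helper are all illustrative assumptions.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy stand-in for an attribute-inference model: predicts a binary attribute
# (e.g., a demographic label) from a user's post.
posts = ["love makeup and fashion hauls", "watching football with the guys",
         "new lipstick shade is gorgeous", "fixing my truck this weekend"]
labels = [1, 0, 1, 0]
model = make_pipeline(TfidfVectorizer(), LogisticRegression()).fit(posts, labels)

# Hand-picked substitution candidates; a real system would derive these from
# word embeddings or a language model.
synonyms = {"makeup": ["supplies"], "fashion": ["gear"],
            "lipstick": ["marker"], "gorgeous": ["decent"]}

def obfuscate(text, true_label, max_queries=50):
    """Greedy black-box word substitution: keep a swap only if it lowers the
    model's confidence in the user's true attribute; stop once the prediction flips."""
    words, queries = text.split(), 0
    for i, original in enumerate(words):
        for candidate in synonyms.get(original, []):
            if queries >= max_queries:
                return " ".join(words)
            trial = words[:i] + [candidate] + words[i + 1:]
            queries += 1
            before = model.predict_proba([" ".join(words)])[0][true_label]
            after = model.predict_proba([" ".join(trial)])[0][true_label]
            if after < before:
                words = trial
            if model.predict([" ".join(words)])[0] != true_label:
                return " ".join(words)  # inference attack is now misled
    return " ".join(words)

print(obfuscate("love makeup and fashion hauls", true_label=1))
```

The greedy loop only ever observes the model's output scores, which is what makes the setting black-box; the actual Adv4SG framework operates on learned text representations rather than a hand-written synonym table.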
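A simplified view of the AttrOBF-style poisoning intuition is sketched next, with a plain neighbor-propagation step standing in for a GNN: observed attribute values spread over a toy social graph, and obfuscating (flipping) the published value of one training user changes what the inference recovers for unlabeled users. The adjacency matrix, training indices, and single-user flip are illustrative assumptions, not the dissertation's optimization procedure.

```python
import numpy as np

# Toy social graph of 5 users (symmetric adjacency) and a binary attribute
# encoded as +1 / -1; 0 marks users whose attribute is unknown to the attacker.
A = np.array([[0, 1, 1, 0, 0],
              [1, 0, 1, 0, 0],
              [1, 1, 0, 1, 0],
              [0, 0, 1, 0, 1],
              [0, 0, 0, 1, 0]], dtype=float)
train_idx = [0, 1, 3]                          # users with observed attributes
attr = np.array([1, 1, 0, -1, 0], dtype=float)

def infer(adjacency, labels, steps=10):
    """Neighbor averaging as a crude stand-in for GNN-based attribute inference:
    repeatedly propagate scores while clamping users with observed attributes."""
    deg = np.maximum(adjacency.sum(axis=1), 1)
    scores = labels.copy()
    for _ in range(steps):
        scores = adjacency @ scores / deg
        scores[train_idx] = labels[train_idx]  # keep observed values fixed
    return np.sign(scores)

print("inference on clean graph:  ", infer(A, attr))

# AttrOBF-style intuition: obfuscate (flip) the published attribute value of a
# selected training user so that inference over the remaining users degrades.
poisoned = attr.copy()
poisoned[1] = -poisoned[1]
print("inference after obfuscation:", infer(A, poisoned))
```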
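Finally, the knowledge-gap intuition behind the watermarking-based defense can be illustrated as follows, again as an assumption-laden sketch rather than the dissertation's mechanism: a secret, keyed input transform is inserted in front of a standard classifier, so an attacker who reproduces only the public, unwatermarked model optimizes perturbations against the wrong pipeline. `WatermarkedModel`, the permutation/sign-flip transform, and the stand-in classifier are hypothetical names introduced here.

```python
import numpy as np

class WatermarkedModel:
    """Wrap a standard classifier with a secret, keyed input transform.
    The key (and hence the transform) is known only to the defender."""

    def __init__(self, base_model, secret_key, input_dim):
        rng = np.random.default_rng(secret_key)
        self.perm = rng.permutation(input_dim)               # secret feature shuffle
        self.sign = rng.choice([-1.0, 1.0], size=input_dim)  # secret sign flips
        self.base_model = base_model                         # trained on watermarked inputs

    def watermark(self, x):
        return x[..., self.perm] * self.sign

    def predict(self, x):
        return self.base_model(self.watermark(x))

def base_model(z):
    # Stand-in for a trained deep classifier; in practice a network trained
    # on watermark(x). An attacker reproducing only the public architecture
    # crafts adversarial perturbations against the un-watermarked input space.
    return (z.sum(axis=-1) > 0).astype(int)

model = WatermarkedModel(base_model, secret_key=1234, input_dim=8)
x = np.random.default_rng(0).normal(size=(2, 8))
print(model.predict(x))
```

The design choice being illustrated is the extra randomness hidden behind the key: any differentiable surrogate the attacker rebuilds omits the secret transform, which is what opens the knowledge gap described in the abstract.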