Explainable AI Techniques for Security

Open Access
- Author: Guo, Wenbo
- Graduate Program: Information Sciences and Technology
- Degree: Doctor of Philosophy
- Document Type: Dissertation
- Date of Defense: June 28, 2022
- Committee Members:
  - Mary Beth Rosson, Program Head/Chair
  - Lin Lin, Outside Unit & Field Member
  - Ting Wang, Co-Chair, Major Member & Dissertation Advisor
  - C Lee Giles, Major Field Member
  - Xinyu Xing, Co-Chair & Dissertation Advisor
- Keywords: computer security, machine learning, explainable AI, deep learning
- Abstract:
While deep learning has shown great potential for automating security analysis, its opaque nature has largely restricted its deployment in real-world security applications. Existing research in the machine learning community has developed techniques to explain the decisions of deep learning classifiers. Unfortunately, these methods are mainly optimized for non-security applications (e.g., computer vision and natural language processing) and lose their effectiveness in security applications due to the following limitations. First, security applications are adversary-intensive by nature; existing explanation methods are vulnerable to adversarial attacks and thus raise severe robustness concerns in such settings. Second, the key assumptions behind most existing explanation methods are often violated in security applications, making these methods either inapplicable or low in explanation fidelity. Third, beyond supervised classifiers, security analysts employ more advanced learning paradigms (e.g., deep reinforcement learning and self-supervised learning) to support sophisticated security analysis, and techniques designed for classifiers cannot explain models trained under these paradigms. Finally, security analysts need to uncover and patch model vulnerabilities, yet existing work focuses only on deriving explanations without providing methods for explanation-guided model remediation.

In this dissertation, I develop a series of explanation-related techniques that tackle the above limitations and thereby establish explainability for the types of deep learning models widely adopted in security applications. Specifically, I first present DANCE, an ensemble learning technique that robustifies existing explanation methods against adversarial attacks. I then present LEMNA, a high-fidelity explanation method for deep learning-based security classifiers that existing explanation methods cannot handle. Going beyond supervised classifiers, CADE and EDGE derive explanations for two advanced learning paradigms used in security applications -- self-supervised learning and deep reinforcement learning. Finally, I discuss the utility of these explanation methods in patching model errors. Together, this dissertation realizes the first explanation and remediation framework for deep learning-based security applications. With this framework, security analysts can establish trust in deep learning model decisions, act quickly when models make mistakes, and even discover new knowledge from the models that could not otherwise be acquired through manual summarization.
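The abstract only names DANCE as an ensemble learning approach to robustifying explanations; the dissertation's actual algorithm is not described here. Purely as an illustration of the general idea, the sketch below averages the feature attributions produced by a generic attribution function over randomly perturbed copies of an input, so that no single small adversarial perturbation can dominate the explanation. The names (`ensemble_explanation`, `explain_fn`) and parameters (`n_samples`, `noise_scale`) are hypothetical and not taken from the dissertation.

```python
# Minimal sketch (NOT the dissertation's DANCE algorithm): aggregate feature
# attributions over randomly perturbed inputs to dampen adversarial effects.
import numpy as np

def ensemble_explanation(explain_fn, x, n_samples=32, noise_scale=0.05, seed=0):
    """Average the attributions returned by `explain_fn` over noisy copies of `x`.

    explain_fn : callable mapping an input vector to a feature-attribution vector
    x          : 1-D numpy array, the input to explain
    """
    rng = np.random.default_rng(seed)
    attributions = []
    for _ in range(n_samples):
        noisy_x = x + rng.normal(scale=noise_scale, size=x.shape)
        attributions.append(explain_fn(noisy_x))
    return np.mean(attributions, axis=0)

# Toy usage: a linear "model" whose attribution is its weight vector scaled by
# the input, so the ensembled result stays close to the clean attribution.
if __name__ == "__main__":
    w = np.array([0.5, -1.0, 2.0])
    explain_fn = lambda v: w * v          # stand-in for a real attribution method
    x = np.array([1.0, 1.0, 1.0])
    print(ensemble_explanation(explain_fn, x))
```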