Causal Discovery from Relational Data: Theory and Practice

Open Access
- Author:
- Lee, Sanghack
- Graduate Program:
- Information Sciences and Technology
- Degree:
- Doctor of Philosophy
- Document Type:
- Dissertation
- Date of Defense:
- January 22, 2018
- Committee Members:
- Vasant Gajanan Honavar, Dissertation Advisor/Co-Advisor
Vasant Gajanan Honavar, Committee Chair/Co-Chair
Clyde Lee Giles, Committee Member
John Yen, Committee Member
Bharath Kumar Sriperumbudur, Outside Member
- Keywords:
- Causality
Causal Model
Graphical Models
Relational Data
Relational Model
- Abstract:
- Discovery of causal relationships from observational and experimental data is a central problem with applications across multiple areas of scientific endeavor. There has been considerable progress over the past decades on algorithms for eliciting causal relationships through a set of conditional independence queries from data. Much of this work assumes that the data instances are independent and identically distributed (iid). However, in many real-world applications the iid assumption is violated, because the underlying data exhibit a relational structure of the sort that is modeled in practice by an entity-relationship model. Motivated by the limitations of traditional approaches to learning causal relationships from relational data, researchers recently introduced the relational causal model. The key idea behind the relational causal model is that a cause and its effects are in a direct or indirect relationship that is reflected in the relational data. Traditional approaches for reasoning with and learning causal models from iid data cannot be trivially applied in the relational setting. Against this background, this dissertation investigates a set of closely related research problems in causal inference with relational data: (i) characterizing the conditional independence relations that hold in a given relational causal model, (ii) sound and complete learning of the structure of a relational causal model using an independence oracle, (iii) measuring the strength of conditional dependence and testing conditional independence among relational variables from relational data, and (iv) robustly learning the structure of a relational causal model from relational data.