An Algebraic Perspective on Computing with Data
Open Access
- Author:
- Wright, William C
- Graduate Program:
- Mathematics
- Degree:
- Doctor of Philosophy
- Document Type:
- Dissertation
- Date of Defense:
- June 04, 2019
- Committee Members:
- Jason Ryder Morton, Dissertation Advisor/Co-Advisor
Jason Ryder Morton, Committee Chair/Co-Chair
Vladimir Itskov, Committee Member
Alexei Novikov, Committee Member
Aleksandra B Slavkovic, Outside Member
- Keywords:
- Mathematics
Statistics
Sheaf Theory
Category Theory
Giry Monad
Algebraic Statistics
Topos Theory
- Abstract:
- Historically, algebraic statistics has focused on the application of techniques from computational commutative algebra, combinatorics, and algebraic geometry to problems in statistics. In this dissertation, we emphasize how sheaves and monads are important tools for thinking about modern statistical computing. First, we explore how probabilistic computing necessitates thinking about random variables as tied to their family of extensions and ultimately reformulate this observation in the language of sheaf theory. We then turn our attention to the relationship between topos theory and the relational algebra of databases, showing how Codd’s original operations can be seen as constructions inside Set. Next, we discuss contextuality, the phenomenon whereby the value of a random variable depends on the other random variables observed simultaneously, and demonstrate how sheaves allow us to lift statistical concepts to contextual measurement scenarios. We then discuss a technique for hypothesis testing based on algebraic invariants whose asymptotic convergence properties do not rely on asymptotic normality of any estimator, as they are defined as energy functionals on the observed data. Finally, we discuss the Giry monad and how its implementation would aid in the analysis of data sets with missing data.