Reconceptualizing Machine Learning Reproducibility
Open Access
- Author:
- Kou, Tianqi
- Graduate Program:
- Informatics
- Degree:
- Master of Science
- Document Type:
- Master's Thesis
- Date of Defense:
- November 11, 2022
- Committee Members:
- Fred Fonseca, Thesis Advisor/Co-Advisor
- Priya Kumar, Committee Member
- Jeffrey Bardzell, Program Head/Chair
- Daniel Susser, Thesis Advisor/Co-Advisor
- Keywords:
- Machine Learning
- Reproducibility
- Philosophy of Machine Learning
- Abstract:
- Reproducibility is broadly interpreted as the chance of getting the same results when a reproduction study re-runs the original study. Many areas of research have adopted the concept as an important criterion for evaluating the quality of research and the validity of research claims. Reproducibility signals the stability of findings and is treated as a surrogate for truth. I use reproducibility to refer to the chance of reproducing the same results, rather than the chance of reproducing the same experiment. In chapter one, I introduce the concept of reproducibility and motivate this thesis. After chapter one, readers should understand why reproducibility is important to scientists and the diverse functions scientists ascribe to it. The concept also carries complications concerning its limitations, contextuality, and operationalization. I also demonstrate why ML researchers care about reproducibility and what an ML reproduction study looks like. I identify a gap: philosophical reflections on ML reproducibility are missing. I show that bridging this gap requires situated analyses of ML research; existing reflections on the concept at a general level are insufficient. In chapter two, I aim to show that the ML community conceptualizes ML research as standardized experiments and adopts direct reproducibility as the criterion for evaluating ML research. To achieve this goal, I analyze existing definitions of ML reproducibility and show how they fail to fulfill the purported functions of a definition of reproducibility. I then characterize what a good definition that fulfills those functions would look like. I conclude the chapter by characterizing the ML community's conceptualization of ML reproducibility as direct reproducibility. In chapter three, I argue that direct reproducibility is incompatible with ML research. To do so, I argue that there are three fundamental instabilities in ML research. Through contextual analyses, I also show how direct reproducibility can be insufficient to support research claims or even counterproductive to the epistemic goals of a study. I conclude the chapter with suggestions for academic reform and directions for future research.