Assessing Automatic Gender Recognition (AGR) Algorithms for Global Volunteered Geographic Information (VGI) Gender-Disaggregation and Gendered Spatial Analysis

Open Access
- Author:
- Anderson, Brigit
- Graduate Program:
- Spatial Data Science
- Degree:
- Master of Science
- Document Type:
- Master Thesis
- Date of Defense:
- October 14, 2024
- Committee Members:
- Fritz Connor Kessler, Thesis Advisor/Co-Advisor
Meghan Kelly, Committee Member
Anthony Robinson, Program Head/Chair
- Keywords:
- VGI
gender
- Abstract:
- A large body of research shows a gender gap in contributions to Volunteered Geographic Information (VGI): contributions from men outnumber those from women. Less is known, however, about the spatial nature of this gap. The overall research question was: do gender and global location play a role in how VGI contributions are distributed? A roadblock to answering this question was the lack of publicly available gender-disaggregated VGI or, in its absence, a repeatable methodology for gender disaggregation. This research therefore focused on applying Automatic Gender Recognition (AGR) tools to gender-disaggregate digital data. To address the primary question, it was necessary to consider how the assessment may be affected by which gender-disaggregation tool is used. This was achieved by breaking the primary research question into three goals: (1) compare whether the performance indices of four AGR tools (GenderAPI, genderize.io, the gender_guesser Python package, and Namsor) vary by gender and country when applied to a dataset with known gender; (2) conduct an exploratory data analysis (EDA) comparing, by location, the gender predictions those same tools produce for VGI contributions of unknown gender; and (3) make recommendations on the selection and application of AGR tools for gender-disaggregating VGI contributions. Using performance indices from Wais (2016), the overall finding was that Namsor, GenderAPI, and genderize.io each had strengths in performance depending on location or data characteristics. The only strength of gender_guesser was that it was the only free and open-source software (FOSS) tool.
All of the AGR tools, however, fell short of global applicability and usefulness for gender-disaggregating VGI because they cannot classify genders outside the binary female/male classification and because of their demonstrated Western bias. These limitations do not negate that the tools remain preferable as a repeatable method for gender-disaggregating VGI. Acknowledging these limitations, and in the absence of other gender-disaggregation tools that incorporate non-binary gender classifications, the tools were used to disaggregate a global VGI dataset of birding observations from eBird. With these tools, the primary research question was answered: each tool showed the same general global trend of male contributions outnumbering female contributions. This project has both research and practical implications. In the research realm, there is a clear need for a repeatable and tested methodology to gender-disaggregate VGI, and this is the first study to examine AGR performance indices country by country. In the practical realm, these methods can be used to chisel away at some of the harm that using VGI in decision-making can inflict when it is not understood who is contributing to VGI, who is telling the story, and, very importantly for geographic data, where that story is being told.
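
The country-by-country comparison of performance indices described in the abstract can be illustrated with a minimal sketch. It is not the thesis's code. It assumes the four error rates commonly attributed to Wais (2016), namely errorCoded, errorCodedWithoutNA, naCoded, and errorGenderBias, computed from a confusion matrix of known versus predicted gender. The function names, the 'f'/'m'/'u' labels, the record layout, and the sign convention for errorGenderBias (positive when more men are labeled female than women are labeled male) are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def performance_indices(pairs):
    """Compute four error rates from (true_gender, predicted_gender) pairs.

    Assumed labels: true gender is 'f' or 'm'; predicted gender is
    'f', 'm', or 'u' (the tool returned no classification).
    """
    counts = Counter(pairs)
    ff, fm, fu = counts[("f", "f")], counts[("f", "m")], counts[("f", "u")]
    mf, mm, mu = counts[("m", "f")], counts[("m", "m")], counts[("m", "u")]

    classified = ff + fm + mf + mm   # names the tool assigned a gender to
    total = classified + fu + mu     # all names, including unclassified

    def ratio(numerator, denominator):
        # Avoid ZeroDivisionError for empty groups.
        return numerator / denominator if denominator else float("nan")

    return {
        # Share of names that are misclassified or unclassified.
        "errorCoded": ratio(fm + mf + fu + mu, total),
        # Share of misclassified names among the classified ones.
        "errorCodedWithoutNA": ratio(fm + mf, classified),
        # Share of names the tool left unclassified.
        "naCoded": ratio(fu + mu, total),
        # Assumed sign convention: positive means women are overcounted.
        "errorGenderBias": ratio(mf - fm, classified),
    }


def indices_by_country(records):
    """Group (country, true_gender, predicted_gender) records by country."""
    grouped = defaultdict(list)
    for country, true_gender, predicted_gender in records:
        grouped[country].append((true_gender, predicted_gender))
    return {country: performance_indices(p) for country, p in grouped.items()}
```

Running `indices_by_country` once per AGR tool on the same known-gender dataset would give one table of indices per tool, which can then be compared across countries.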