Mining users' self-privacy violations in online public discourse using data science techniques
Open Access
- Author:
- Umar, Prasanna
- Graduate Program:
- Information Sciences and Technology
- Degree:
- Doctor of Philosophy
- Document Type:
- Dissertation
- Date of Defense:
- December 02, 2021
- Committee Members:
- Andrea Tapia, Major Field Member
Christopher Griffin, Outside Unit & Field Member
Sarah Rajtmajer, Major Field Member
Anna Squicciarini, Chair & Dissertation Advisor
Mary Beth Rosson, Program Head/Chair
- Keywords:
- online self-disclosure
public discourse
privacy
automated detection
- Abstract:
- User engagement in online public discourse often entails self-disclosure, the intentional sharing of personal information with others. Users self-disclose in pursuit of various strategic goals and benefits, both intrinsic and extrinsic. Intrinsically, self-disclosure serves therapeutic functions, while extrinsic rewards include social and relational benefits such as social connectedness, validation, and relational development. Personal information released on online public platforms (e.g., commentaries) becomes part of shared knowledge and is subject to detrimental uses by advertisers and malicious parties. This behavior therefore poses risks to users’ online privacy. Yet users are often unaware of the sheer amount of personal information they share across online forums, commentaries, and websites, even beyond social network sites. We note that while work on self-disclosure in seemingly bounded environments (e.g., social networks) has shed light on users’ motivations for self-disclosure and contextual influences on such behavior, less is known about user disclosures in public commentaries. In this dissertation, we address this research gap by examining self-disclosure in online public commentaries. Specifically, we devise ways to automatically detect online self-disclosure from users’ content and then contextualize users’ disclosures through the study of multiple antecedents to, motivations for, and patterns of self-disclosing behavior. First, we propose a rule-based method of automated self-disclosure detection using a news discourse dataset. We also elucidate the effects of three hypothesized antecedents of self-disclosing behavior: peer influence (reciprocity), topic of conversation, and anonymity. Our results support the role of anonymity and peer influence in eliciting self-disclosure responses from online users. We also find that self-disclosure varies across topical contexts.
Second, we detail a generalizable supervised approach to self-disclosure detection that outperforms existing detection methods. Using multiple public discourse datasets, we highlight users’ possible attempts at managing privacy risks through alignment with group disclosure norms in both content and self-disclosure rate. Results of our analyses show that users can stand out in conversations because of disproportionate self-disclosure patterns and dissimilar disclosure content. Finally, we analyze the temporal evolution of self-disclosure patterns in public Twitter interaction networks as users maintain persistent social connections for therapeutic, social, and relational benefits amidst the uncertainties of the COVID-19 pandemic. We note heightened self-disclosure with the evolution of the pandemic and shifts in users’ privacy perceptions.