Understanding and Detecting Online Misinformation with Auxiliary Data

Open Access
- Author:
- Cui, Limeng
- Graduate Program:
- Information Sciences and Technology
- Degree:
- Doctor of Philosophy
- Document Type:
- Dissertation
- Date of Defense:
- February 15, 2022
- Committee Members:
- Mary Beth Rosson, Professor in Charge/Director of Graduate Studies
Dongwon Lee, Chair & Dissertation Advisor
Lin Lin, Outside Unit & Field Member
Amulya Yadav, Major Field Member
Anna Squicciarini, Major Field Member
- Keywords:
- Misinformation detection
Graph neural networks
Knowledge graph
Social media
Fraud detection
Active learning
- Abstract:
- The Internet gives users great convenience in accessing, creating, and sharing diverse information, but it also promotes the spread of misinformation. Cheap to produce and easily accessible, fake content online can shape public perception and cause detrimental societal effects. How to effectively detect online misinformation and attenuate its effects has therefore gained much attention in recent years. Recent misinformation detection methods have shown promising results; however, enormous challenges remain, owing to the multi-modality of the problem, the need for interpretability, and the cost of human annotation. To address these issues, we can leverage various types of auxiliary information from different perspectives. For example, user engagements with news articles, including posts and comments, contain justifications about those articles. Since such auxiliary data can provide rich contextual information for more accurate and interpretable detection, it is essential to understand and detect misinformation by integrating multiple sources. This task is challenging because the proposed methods must exploit auxiliary supervision to learn from limited data while still detecting misinformation effectively. In this regard, three scenarios related to detecting and understanding online misinformation are discussed in this dissertation. First, the rich information available in user comments on social media suggests investigating whether the latent sentiments hidden in those comments can help distinguish fake news from reliable content; a sentiment-aware fake news detection method is proposed to account for users' latent sentiments. Second, because users often lack sufficient prior knowledge, a misinformation detection method should offer interpretable results rather than bare prediction labels; a knowledge-guided model is proposed to overcome the limited social context available in domains such as healthcare. Third, human labeling is time-consuming and costly, a problem further exacerbated in misinformation detection scenarios where the datasets are imbalanced; a novel active learning framework is proposed to improve model performance at a lower labeling cost when detecting fraudsters on online websites. Finally, this work closes with future directions for intervening in the dissemination of misinformation at an early stage. When labeled data is limited in a new genre or language, transferring knowledge from high-resource domains to the new, low-resource domain is a promising solution. The findings of this dissertation significantly expand the boundaries of online misinformation detection and inspire improvements to general machine learning methods.
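
As a rough illustration of the third scenario, the sketch below shows pool-based active learning with uncertainty sampling on a synthetic, imbalanced dataset standing in for fraud detection. This is a minimal, generic sketch assuming scikit-learn, not the novel framework proposed in the dissertation; the dataset, seed-set sizes, query budget, and query strategy are all illustrative assumptions.

```python
# Generic pool-based active learning sketch (assumed setup, not the
# dissertation's framework): train on a small labeled set, query the
# most uncertain unlabeled points, and repeat.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic stand-in for an imbalanced fraud dataset: ~5% positives.
X, y = make_classification(n_samples=2000, n_features=20,
                           weights=[0.95, 0.05], random_state=0)

# Seed the labeled set with a few examples of each class so the first
# model fit is well defined; everything else starts in the unlabeled pool.
pos, neg = np.where(y == 1)[0], np.where(y == 0)[0]
labeled = list(rng.choice(pos, 5, replace=False)) + \
          list(rng.choice(neg, 15, replace=False))
pool = [i for i in range(len(X)) if i not in set(labeled)]

model = LogisticRegression(max_iter=1000)
for _ in range(10):  # 10 query rounds
    model.fit(X[labeled], y[labeled])
    # Uncertainty sampling: pick the pool points whose predicted fraud
    # probability is closest to 0.5, i.e., where the model is least sure.
    proba = model.predict_proba(X[pool])[:, 1]
    query = [pool[i] for i in np.argsort(np.abs(proba - 0.5))[:10]]
    labeled.extend(query)  # in practice, a human oracle labels these
    pool = [i for i in pool if i not in set(query)]

print(f"labeled {len(labeled)} of {len(X)} examples")
```

In the dissertation's actual setting, the oracle step would be a human annotator, and the query strategy would be the proposed framework tailored to imbalanced fraud data rather than plain uncertainty sampling.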