Security Study of Ranking Algorithms
Open Access
- Author:
- Noureldeen, Ali
- Graduate Program:
- Computer Science and Engineering
- Degree:
- Master of Science
- Document Type:
- Master Thesis
- Date of Defense:
- May 30, 2019
- Committee Members:
- Sencun Zhu, Thesis Advisor/Co-Advisor
Danfeng Zhang, Committee Member
Chitaranjan Das, Program Head/Chair
- Keywords:
- Ranking Algorithms
Machine Learning
- Abstract:
- Rankings, ratings, and reviews play an important role in social media networks and platforms. Many users base their decisions solely on such properties before downloading an application or participating in a social media platform. Platforms that rely on these properties to determine the most relevant results have become a target for attackers. The ranking algorithms these platforms use are usually hidden from the public to preserve the integrity of the platform. The attackers' main goal is to undermine that integrity by exploiting vulnerabilities. For instance, if the model used for ranking is compromised, attackers can promote their own agenda and fake products in search results. To help mitigate such vulnerabilities, this thesis studies various ranking systems across different platforms. Specifically, it analyzes Reddit's platform and models the Reddit ranking algorithm using machine learning. The modeling starts by crawling Reddit's popular "hot" sorting page. Next, after processing the data, we train a simple feed-forward neural network to approximate the ranking. Based on our generalized model, and without a ground truth for comparison, the results show that three factors affect the ranking with high probability: the number of likes, the number of comments, and the post duration, that is, the age of the post in hours. Finally, this work proposes a way to prevent these attacks by adding a factor to the ranking algorithm that distinguishes possible attackers from actual users. This added factor makes the ranking algorithm more robust against such attacks, which in turn preserves the integrity of the platform.
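The modeling step described in the abstract can be illustrated with a minimal sketch. It is not the thesis's code or architecture: the data are synthetic rather than crawled from Reddit, the target score is a made-up function (not Reddit's ranking formula), and all function names, the hidden-layer size, and the learning rate are assumptions chosen for illustration. It only shows the general shape of fitting a small feed-forward network to three inputs (likes, comments, post age in hours).

```python
import numpy as np


def make_synthetic_posts(n, rng):
    """Generate hypothetical posts; stands in for crawled data."""
    likes = rng.integers(1, 5000, size=n).astype(float)
    comments = rng.integers(0, 500, size=n).astype(float)
    age_hours = rng.uniform(0.1, 48.0, size=n)
    # Made-up target, NOT Reddit's formula: engagement raises the
    # score, age lowers it.
    score = np.log10(likes) + 0.5 * np.log10(comments + 1.0) - age_hours / 12.0
    X = np.column_stack([np.log10(likes), np.log10(comments + 1.0), age_hours])
    return X, score


def standardize(X):
    """Scale each feature column to zero mean and unit variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0)


def train_mlp(X, y, hidden=8, lr=0.05, epochs=2000, seed=0):
    """Fit a one-hidden-layer tanh network with full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, size=(X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, size=(hidden, 1))
    b2 = np.zeros(1)
    n = X.shape[0]
    y = y.reshape(-1, 1)
    for _ in range(epochs):
        h = np.tanh(X @ W1 + b1)           # hidden activations
        err = (h @ W2 + b2) - y            # prediction error
        # Gradients of the mean squared error.
        gW2 = h.T @ err * (2.0 / n)
        gb2 = 2.0 * err.mean(axis=0)
        dh = (err @ W2.T) * (1.0 - h ** 2)  # backprop through tanh
        gW1 = X.T @ dh * (2.0 / n)
        gb1 = 2.0 * dh.mean(axis=0)
        W1 -= lr * gW1
        b1 -= lr * gb1
        W2 -= lr * gW2
        b2 -= lr * gb2
    return W1, b1, W2, b2


def predict(params, X):
    """Return the predicted score for each row of X."""
    W1, b1, W2, b2 = params
    return (np.tanh(X @ W1 + b1) @ W2 + b2).ravel()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X, y = make_synthetic_posts(500, rng)
    Xs = standardize(X)
    params = train_mlp(Xs, y)
    mse = np.mean((predict(params, Xs) - y) ** 2)
    print(f"training MSE: {mse:.4f} (target variance: {np.var(y):.4f})")
```

Sorting posts by the predicted score in descending order would give an approximate ranking under these assumptions. With real crawled data, the target would instead be the observed position on the hot page.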