An Analysis of Factors Predicting Click Through Rate During Web Searching Using Neural Networks

Open Access
Author:
Zhang, Ying
Graduate Program:
Industrial Engineering
Degree:
Master of Science
Document Type:
Master Thesis
Date of Defense:
None
Committee Members:
  • Bernard James Jansen, Thesis Advisor
  • Enrique Del Castillo, Thesis Advisor
Keywords:
  • RBFN
  • MLPN
  • transactional log
  • neural network
  • search engine
Abstract:
In this research, I investigate the interactions between users and search engines during Web searching using Neural Networks (NNs) and data from two Web search engine transactional logs. The goal of this research is to identify factors that significantly affect the click through rate (CTR) of Web searchers. CTR, a metric for measuring customer satisfaction, is calculated by dividing the number of links that a searcher clicks by the number of links that are delivered. The purpose of this research is to leverage user-system interaction data to improve CTRs and to design more efficient ranking and searching algorithms. The analysis consists of two parts: basic data analysis and extended data analysis. The basic data analysis shows that users spend less effort on satisfying their information needs later in the day and on the weekend. Most users prefer the Internet Explorer browser, select informational links, and search the Web vertically. In the extended data analysis, two NNs are used: the Multilayer Backpropagation Neural Network (MLPN) and the Radial Basis Function Neural Network (RBFN). I investigate the characteristics and quality of these two kinds of networks by screening out the nonlinear behavior in the search logs. I then use the MLPN, by virtue of its higher efficiency compared with the RBFN, to detect the influence of significant factors on the CTR. The findings show that the number of organic links viewed, the type of vertical, and the type of user intent are negatively correlated with the CTR. In contrast, the number of queries reformulated by users, searching time, average query length, type of browser, and the sum of the ranks of the links shown on the search engine results page (SERP) are all positively correlated with the CTR. Based on these results, I provide some recommendations for improving the performance of searching algorithms.
The implications of this research suggest a new direction for future research.
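The CTR definition given in the abstract (links clicked divided by links delivered) can be illustrated with a minimal sketch. The function name `click_through_rate`, its parameter names, and the sample session counts are hypothetical and chosen only for illustration; they are not taken from the thesis or its transactional logs.

```python
def click_through_rate(links_clicked: int, links_delivered: int) -> float:
    """Return CTR = links clicked / links delivered.

    Follows the definition stated in the abstract. The input validation
    below is an assumption of this sketch, not part of the thesis.
    """
    if links_delivered <= 0:
        raise ValueError("links_delivered must be positive")
    if links_clicked < 0:
        raise ValueError("links_clicked must be non-negative")
    return links_clicked / links_delivered


# Hypothetical sessions as (links clicked, links delivered) pairs.
sessions = [(3, 10), (1, 10), (0, 20)]
rates = [click_through_rate(clicked, delivered) for clicked, delivered in sessions]

for (clicked, delivered), rate in zip(sessions, rates):
    print(f"clicked={clicked}, delivered={delivered}, CTR={rate:.2f}")
```

Under this definition, a session with no delivered links has an undefined CTR, so the sketch raises an error rather than returning a value.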