Analysis and Prediction of Web User Interactions Using Time Series Analysis

Open Access
Mohan, Vijay Vyas
Graduate Program:
Electrical Engineering
Master of Science
Document Type:
Master Thesis
Date of Defense:
June 10, 2009
Committee Members:
  • Dr Jim Jansen, Thesis Advisor
  • Bernard James Jansen, Thesis Advisor
  • George Kesidis, Thesis Advisor
Keywords:
  • search log analysis
  • web search
Abstract:
In this research, the methodology of time series analysis is studied and adapted to analyze the temporal facets of individual user interaction with search engines as recorded in search logs. A massive search engine query log with more than 3.5 million queries over a period of three months is first enhanced with factors that classify each user query by user intent, type of query, and other aspects. Temporal characteristics are used to obtain additional factors, such as the elapsed time between submitting a query and clicking a result, and to track seasonal components such as daily and weekly cycles for each query. Two popular approaches to time series analysis are explored: the Box-Jenkins ARIMA method and the regression method. A framework is provided for using time series analysis to predict the future actions of the individual user. Time series regression models are obtained for every active user to predict, one step ahead, the rank of the results clicked. The aggregate statistical analysis of the obtained time series models is used to recognize similarities in user behavior for Web search and to identify significant predictors of rank clicks. Predicting Web search engine users’ future actions and analyzing their searching behavior could be very useful to Web service providers and for optimizing online advertisements.
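
To make the idea of a one-step-ahead regression forecast concrete, the following is a minimal sketch and not the thesis's actual model. It assumes a single predictor, the rank of the user's previous click, whereas the abstract describes additional factors (user intent, query type, elapsed time, seasonal components). The function names and the sample click ranks are hypothetical.

```python
from statistics import mean


def fit_lag1_regression(series):
    """Fit y_t = a + b * y_{t-1} by ordinary least squares.

    Simplifying assumption: the only predictor is the previous value.
    """
    if len(series) < 3:
        raise ValueError("need at least three observations")
    x = series[:-1]  # lagged values y_{t-1}
    y = series[1:]   # targets y_t
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        # Constant history: the slope is undefined, so predict the mean.
        return y_bar, 0.0
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def predict_next(series, a, b):
    """One-step-ahead forecast from the last observed value."""
    return a + b * series[-1]


# Hypothetical sequence of clicked result ranks for one user.
clicked_ranks = [1, 2, 1, 3, 2, 1, 2, 4, 2, 1]
a, b = fit_lag1_regression(clicked_ranks)
forecast = predict_next(clicked_ranks, a, b)
```

Fitting one such model per active user and then comparing the fitted coefficients across users is one simple way to picture the aggregate analysis the abstract mentions.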